CN110060331A - Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network - Google Patents

Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network

Info

Publication number
CN110060331A
Authority
CN
China
Prior art keywords
picture
plane
pixel
convolutional neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910193450.XA
Other languages
Chinese (zh)
Inventor
颜成钢
徐浙峰
任浩帆
孙垚棋
张继勇
张勇东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201910193450.XA priority Critical patent/CN110060331A/en
Publication of CN110060331A publication Critical patent/CN110060331A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computer Graphics (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network. The method comprises the following steps: Step 1, train a fully convolutional neural network by supervised learning. Step 2, perform depth estimation on each picture with the fully convolutional neural network: a series of consecutive pictures of an outdoor scene is shot with a monocular camera, and each picture is then fed as input to the trained fully convolutional neural network, which estimates its depth and yields its three-dimensional point cloud model. Step 3, fuse the three-dimensional models of all pictures into one complete three-dimensional model with the ICP algorithm. The invention solves the three-dimensional reconstruction problem of monocular cameras and can be realized on ordinary hardware such as a PC or workstation.

Description

Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network
Technical field
The invention belongs to the technical fields of computer vision and computer graphics, and in particular relates to an outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network.
Background technique
Three-dimensional reconstruction is an important and fundamental problem in computer vision and computer graphics, with a very wide range of applications in fields such as agriculture, medicine, aerospace, the military, environmental observation and terrain survey. One small branch of it, outdoor three-dimensional reconstruction of city scenes, can in turn play an important role in fields such as map navigation and urban planning. Once the three-dimensional map of a city is available, people can conveniently view any corner of the city through various electronic devices; Google Maps is a very successful example in this respect. Combined with virtual reality and augmented reality, and integrated with functions such as living information, e-commerce, virtual communities and services, it can bring people a more immersive experience. The research of outdoor three-dimensional reconstruction therefore has high scientific and application value.
In the field of computer graphics, three-dimensional reconstruction with a monocular camera has always been an important and challenging problem. Unlike binocular cameras and depth cameras, a monocular camera cannot obtain the depth of each pixel directly through triangulation or through ToF and structured-light principles. However, after long development the technology of monocular cameras is relatively mature: they are low-cost, structurally simple, undemanding of computing resources and easier to commercialize; the standard camera on virtually every smartphone is a good monocular camera. The method of the invention therefore uses a fully convolutional neural network, trained by supervised learning, to perform depth estimation on each picture obtained by the monocular camera, and then fuses the results into one complete three-dimensional model, thereby completing the three-dimensional reconstruction.
Summary of the invention
The present invention is intended to provide a useful solution to the three-dimensional reconstruction problem of monocular cameras. The input is a set of pictures of an outdoor scene shot by a monocular camera; the method performs depth estimation on each picture individually and finally fuses the results into one complete three-dimensional model.
The method of the invention comprises the following steps:
Step 1: train a fully convolutional neural network by supervised learning.
Step 2: perform depth estimation on each picture with the fully convolutional neural network.
A series of consecutive pictures of an outdoor scene is shot with a monocular camera; each picture is then fed as input to the trained fully convolutional neural network, which performs depth estimation on the picture and yields its three-dimensional point cloud model.
Step 3: fuse the three-dimensional models of all pictures into one complete three-dimensional model with the ICP algorithm.
Step 1 is implemented as follows:
1-1. Prepare a large number of training pictures for training the network parameters.
Each group of training pictures comprises an ordinary color picture of an outdoor scene shot from a certain angle, the depth picture corresponding to that color picture, and pixel-level semantic segmentation information. Redundant data are rejected by means of the pixel-level semantic segmentation information in the SYNTHIA dataset.
1-2. Carry out mathematical modelling of the picture data. Let {Iᵢ, Dᵢ}, i = 1, …, N, denote the N groups of color pictures and depth pictures in the dataset, the camera intrinsic matrix K being known. For any pixel q in color picture Iᵢ, with homogeneous coordinates [x, y, 1]ᵀ (T denotes transposition), the corresponding point Q in three-dimensional space is calculated with the following formula:
Q = Dᵢ(q) · K⁻¹q    (Formula 1)
Assume the normal vector of a plane in three-dimensional space is n ∈ R³. To make the normal vector of each plane unique, n is calculated as
n = n̄ / d
where n̄ denotes the unit normal vector of the plane, pointing from the origin toward the plane, and d denotes the distance of the plane from the origin. If a point Q lies on the plane, then it satisfies nᵀQ = 1.
Assume color picture Iᵢ contains M planes; a pixel probability matrix Sᵢ is then constructed for the picture. Sᵢ(q) is an (M+1)-dimensional vector whose j-th element, denoted Sᵢ(q)ⱼ, indicates the probability that pixel q falls on the j-th plane, with j = 0 denoting non-planar. The plane parameters {nⱼ} of the i-th picture can be obtained by minimizing the following objective function:
L = Σ_q Σ_{j=1..M} Sᵢ(q)ⱼ · |nⱼᵀQ(q) − 1| + α · R(Sᵢ)
where R(Sᵢ) is a regularization term that prevents the network from producing the trivial result in which every pixel is assigned to the non-planar class, and α is a weighting coefficient. When pixel q is projected from the picture into three-dimensional space, the corresponding point must, owing to the perspective structure, lie on the ray through q. Let λ denote the depth of the intersection of this ray with a plane; the three-dimensional coordinate of pixel q in space is then λK⁻¹q, so that
λ = 1 / (nᵀK⁻¹q)
The regularization term R(Sᵢ) is calculated with the following formula:
R(Sᵢ) = −Σ_q log(1 − Sᵢ(q)₀)
where 1 − Sᵢ(q)₀ denotes the probability that pixel q falls on a plane, with value range [0, 1].
The semantic information in the dataset is divided into two classes: "keep" = {building, road, sidewalk, lane marking} and "discard" = {pedestrian, car, sky, bicycle}. If a pixel belongs to the "keep" class, set z(q) = 1; if it belongs to the "discard" class, set z(q) = 0. The regularization term above is then rewritten as:
R(Sᵢ) = −Σ_q z(q) · log(1 − Sᵢ(q)₀)
The fully convolutional neural network is divided into two parts: one part segments the planes in the picture; the other generates the three-dimensional point cloud model of the picture. The two parts share the same abstract feature maps.
Fusing the three-dimensional models of all pictures into one complete three-dimensional model with the ICP algorithm in Step 3 is implemented as follows:
3-1. Determine the overlapping part of the two point clouds.
First extract and match the feature points of the two pictures with the SIFT algorithm, obtaining matched point sets Q and Q′. From the transformation between the two point sets, the homography matrix H is obtained, i.e. Q′ = HQ.
Then calculate the four vertex coordinates of the registered picture and carry out image registration, obtaining the set of pixels in the overlapping region of the two pictures; keep only the pixels belonging to the "keep" class according to the semantic information, finally obtaining the pixel set N′ = {1, …, n′}.
For the two known point clouds, the overlapping regions can be expressed as:
P = {p₁, …, p_n′},  P′ = {p′₁, …, p′_n′}    (Formula 9)
3-2. Find the rotation matrix R and translation vector t of a Euclidean transformation that matches the two point clouds, i.e.:
pᵢ = Rp′ᵢ + t,  i = 1, …, n′
R and t are solved with the ICP algorithm, obtained by minimizing the sum of squared errors:
min_{R,t} Σ_{i=1..n′} ‖pᵢ − (Rp′ᵢ + t)‖²
First, to compute the rotation matrix R, calculate the centroids of the two point clouds:
p = (1/n′) Σᵢ pᵢ,  p′ = (1/n′) Σᵢ p′ᵢ
Then calculate the de-centroided coordinates qᵢ and q′ᵢ of every point:
qᵢ = pᵢ − p,  q′ᵢ = p′ᵢ − p′    (Formula 13)
Define the matrix W = Σᵢ qᵢq′ᵢᵀ; W is a 3×3 matrix. Applying SVD to W gives:
W = UΣVᵀ
R is then
R = UVᵀ
and the translation vector t is calculated as
t = p − Rp′.
3-3. After the translation and rotation transformation, transform the point cloud in P′ into the coordinate system of P with the following formula:
p̂′ᵢ = Rp′ᵢ + t
thereby realizing the fusion of the two point clouds. This operation is applied to all point clouds until only one three-dimensional point cloud model remains, so as to complete the three-dimensional reconstruction of the entire outdoor scene.
Features and advantages of the invention:
The invention realizes an outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network, which is of real significance for three-dimensional reconstruction. The fully convolutional neural network trained by supervised learning performs depth estimation directly on a color picture and obtains its three-dimensional point cloud model; all point cloud models are then fused, completing the three-dimensional reconstruction of the outdoor scene.
Compared with binocular cameras and depth cameras, monocular cameras have matured over long development; they are low-cost, structurally simple and easier to commercialize, and they avoid the computationally demanding triangulation required by binocular cameras. For example, the camera found on almost every smartphone today is a monocular camera, and its imaging quality is good enough to be used directly.
The technology can be realized on ordinary hardware such as a PC or workstation.
Detailed description of the invention
Fig. 1 is the overall flow chart of the method of the invention.
Fig. 2 shows the model of the fully convolutional neural network used in the invention.
Specific embodiment
To make the above objects, features and advantages of the invention clearer and easier to understand, the invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, an outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network comprises the following steps:
Step 1: train a fully convolutional neural network by supervised learning.
As with any other neural network model, a large number of training pictures must first be prepared for training the network parameters. Each group of training pictures comprises an ordinary color picture of an outdoor scene shot from a certain angle, the depth picture corresponding to that color picture, and pixel-level semantic segmentation information. Since collecting and labelling a dataset manually would cost a great deal of time and effort, the SYNTHIA dataset can be used. Although its data all come from a virtual city, the scenes simulated by the computer bear considerable similarity to the real world. Note that this dataset was originally built for autonomous driving: data are acquired by simulating a car driving under real traffic conditions and shooting photographs at regular intervals from a fixed position and angle on the vehicle. The dataset therefore contains many groups of nearly identical pictures; the redundant data can be rejected according to the vehicle speed, avoiding meaningless computation. Beyond this, unwanted parts of each picture must also be rejected: information such as the pedestrians and cars in a picture need not be included in the reconstructed three-dimensional model, whereas information such as roads and building surfaces must be kept. This step is completed via the pixel-level semantic segmentation information in the SYNTHIA dataset; the detailed process is embodied in the regularization term below.
Before the neural network model is introduced, the picture data must be modelled mathematically. Let {Iᵢ, Dᵢ}, i = 1, …, N, denote the N groups of color pictures and depth pictures in the dataset, the camera intrinsic matrix K being known. For any pixel q in color picture Iᵢ, with homogeneous coordinates [x, y, 1]ᵀ (T denotes transposition), the corresponding point Q in three-dimensional space is calculated with the following formula:
Q = Dᵢ(q) · K⁻¹q    (Formula 1)
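As an illustration of Formula 1, the following minimal NumPy sketch back-projects a whole depth map into a point cloud; the intrinsic matrix values and the image size are hypothetical placeholders, not parameters from the patent.

```python
import numpy as np

def backproject(depth, K):
    """Back-project a depth map to 3D points via Q = D_i(q) * K^-1 * q (Formula 1)."""
    h, w = depth.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))               # pixel grid
    q = np.stack([xs, ys, np.ones_like(xs)], -1).reshape(-1, 3).T  # 3 x (h*w) homogeneous pixels
    Q = depth.reshape(1, -1) * (np.linalg.inv(K) @ q)              # scale each ray by its depth
    return Q.T                                                     # (h*w) x 3 point cloud

# Hypothetical intrinsics and a constant dummy depth map, for illustration only.
K = np.array([[525.0, 0.0, 319.5],
              [0.0, 525.0, 239.5],
              [0.0,   0.0,   1.0]])
cloud = backproject(np.full((480, 640), 2.0), K)
print(cloud.shape)  # (307200, 3)
```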
Since almost all the data in the three-dimensional reconstruction process concern plane information, assume the normal vector of a plane in three-dimensional space is n ∈ R³. To make the normal vector of each plane unique, n is calculated in the following way:
n = n̄ / d
where n̄ denotes the unit normal vector of the plane, pointing from the origin toward the plane, and d denotes the distance of the plane from the origin. If a point Q lies on the plane, then it satisfies nᵀQ = 1.
Assume color picture Iᵢ contains M planes; a pixel probability matrix Sᵢ is then constructed for the picture. Sᵢ(q) is an (M+1)-dimensional vector whose j-th element, denoted Sᵢ(q)ⱼ, indicates the probability that pixel q falls on the j-th plane, with j = 0 denoting non-planar. The plane parameters {nⱼ} of the i-th picture can be obtained by minimizing the following objective function:
L = Σ_q Σ_{j=1..M} Sᵢ(q)ⱼ · |nⱼᵀQ(q) − 1| + α · R(Sᵢ)
where R(Sᵢ) is a regularization term that prevents the network from producing the trivial result in which every pixel is assigned to the non-planar class, and α is a weighting coefficient. When pixel q is projected from the picture into three-dimensional space, the corresponding point must, owing to the perspective structure, lie on the ray through q. Let λ denote the depth of the intersection of this ray with a plane; the three-dimensional coordinate of pixel q in space is then λK⁻¹q, so that
λ = 1 / (nᵀK⁻¹q)
The regularization term R(Sᵢ) can be calculated with the following formula:
R(Sᵢ) = −Σ_q log(1 − Sᵢ(q)₀)
where 1 − Sᵢ(q)₀ denotes the probability that pixel q falls on a plane (regardless of which plane), with value range [0, 1]. Note that not every pixel should take part in the three-dimensional reconstruction: whether pixels need to be reconstructed logically depends on their semantic information. Pixels with semantics such as road or building facade should be included in the reconstructed three-dimensional point cloud model, while pixels with semantics such as pedestrian or car should be removed. The semantic information in the dataset is therefore divided into two classes: "keep" = {building, road, sidewalk, lane marking, …} and "discard" = {pedestrian, car, sky, bicycle, …}. If a pixel belongs to the "keep" class, set z(q) = 1; if it belongs to the "discard" class, set z(q) = 0. The regularization term above can then be rewritten as:
R(Sᵢ) = −Σ_q z(q) · log(1 − Sᵢ(q)₀)
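For concreteness, here is a minimal NumPy sketch of how the objective above could be evaluated; the array shapes, the clipping constant and the helper name are illustrative assumptions, not the patent's implementation (in practice this loss would be written in TensorFlow and minimized during training).

```python
import numpy as np

def plane_objective(S, planes, Q, z, alpha=0.1):
    """Objective for one picture.
    S:      (P, M+1) per-pixel plane probabilities, column 0 = non-planar
    planes: (M, 3) plane parameters n_j, with n_j^T Q = 1 for points on plane j
    Q:      (P, 3) back-projected 3D points (Formula 1)
    z:      (P,) semantic gate: 1 for "keep" pixels, 0 for "discard" pixels
    """
    # Data term: sum over pixels and planes of S_i(q)_j * |n_j^T Q(q) - 1|
    residual = np.abs(Q @ planes.T - 1.0)          # (P, M) distance-to-plane residuals
    data_term = (S[:, 1:] * residual).sum()
    # Semantically gated regularizer: -sum_q z(q) log(1 - S_i(q)_0),
    # discouraging the trivial "everything is non-planar" solution
    p_plane = np.clip(1.0 - S[:, 0], 1e-8, 1.0)
    reg_term = -(z * np.log(p_plane)).sum()
    return data_term + alpha * reg_term
```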
The fully convolutional neural network model used in the invention is trained from scratch in the publicly available TensorFlow framework; the network structure is shown in Fig. 2. The whole architecture is divided into two parts. One part segments the planes in the picture: because the planes of an outdoor scene account for a substantial portion of the data in the whole reconstruction process, they are computed separately to guarantee the accuracy of the final result. In this part, the activation function of the prediction layer is Softmax and all other layers use ReLU. The other part generates the three-dimensional point cloud model of the picture and shares the same abstract feature maps with the first part. It comprises two stride-2 convolutional layers (3×3×512), followed by a 1×1×3M convolutional layer that outputs the M plane parameters, and then a global average pooling layer. Except for the last layer, which needs no activation function, all layers use ReLU. In the final parameter design, α = 0.1 and the number of planes M = 5. When training the model, the Adam optimization algorithm can be used with β₁ = 0.99, β₂ = 0.9999, a learning rate of 0.0001 and a batch size of 4.
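Below is a sketch of the plane-parameter branch just described (two stride-2 3×3×512 convolutions, a 1×1 convolution with 3M channels, then global average pooling). The backbone is only given in Fig. 2, so the shape of the shared abstract feature maps assumed here (60×80×512) is an illustrative assumption.

```python
import tensorflow as tf

M = 5  # number of planes, as in the parameter design above

def plane_parameter_branch(features):
    """Maps shared feature maps to M plane parameters n_j in R^3."""
    x = tf.keras.layers.Conv2D(512, 3, strides=2, padding="same",
                               activation="relu")(features)
    x = tf.keras.layers.Conv2D(512, 3, strides=2, padding="same",
                               activation="relu")(x)
    x = tf.keras.layers.Conv2D(3 * M, 1)(x)          # last layer: no activation
    x = tf.keras.layers.GlobalAveragePooling2D()(x)  # (batch, 3M)
    return tf.keras.layers.Reshape((M, 3))(x)        # (batch, M, 3)

features = tf.keras.Input(shape=(60, 80, 512))  # assumed feature-map shape
model = tf.keras.Model(features, plane_parameter_branch(features))
model.compile(optimizer=tf.keras.optimizers.Adam(
    learning_rate=1e-4, beta_1=0.99, beta_2=0.9999))  # hyper-parameters above
```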
Step 2: perform depth estimation on each picture with the fully convolutional neural network.
A series of consecutive pictures of an outdoor scene is shot with a monocular camera; each picture is then fed as input to the trained fully convolutional neural network, which performs depth estimation on it and yields its three-dimensional point cloud model.
Step 3: fuse the three-dimensional models of all pictures into one complete three-dimensional model with the ICP algorithm.
After the three-dimensional point cloud model of each picture has been obtained, the models must be fused into a single point cloud model. The Iterative Closest Point (ICP) algorithm is a point cloud matching algorithm for solving the 3D-3D pose estimation problem. The point clouds of two pictures with close shooting times are taken: since the shooting times are close, the difference between the pictures is small and the overlap of their three-dimensional point clouds is large, which makes them well suited for matching and fusion.
Before the ICP algorithm is applied, however, the overlapping part of the two point clouds must be determined. Since the point cloud models are estimated from the color pictures, the overlapping region of the two color pictures is calculated directly in order to guarantee the accuracy of the overlapping pixel set. First the feature points of the two pictures are extracted and matched with the SIFT algorithm, giving matched point sets Q and Q′. From the transformation between the two point sets, the homography matrix H can be obtained, i.e. Q′ = HQ. Then the four vertex coordinates of the registered picture are calculated and image registration is carried out, giving the set of pixels in the overlapping region of the two pictures; only the pixels belonging to the "keep" class are retained according to the semantic information, finally giving the pixel set N′ = {1, …, n′}.
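Here is a sketch of this overlap computation with OpenCV's SIFT implementation and RANSAC homography estimation; the ratio-test threshold of 0.75 and the RANSAC reprojection threshold are conventional choices assumed for illustration, not values stated in the patent.

```python
import cv2
import numpy as np

def overlap_homography(img1, img2):
    """Match SIFT features between two pictures and estimate H with Q' = H Q."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    # Keep only distinctive matches (Lowe's ratio test).
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    # Project the four corners of picture 1 into picture 2 to locate the overlap.
    h, w = img1.shape[:2]
    corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    return H, cv2.perspectiveTransform(corners, H)
```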
For the two known point clouds, the overlapping regions can be expressed as:
P = {p₁, …, p_n′},  P′ = {p′₁, …, p′_n′}
If the rotation matrix R and translation vector t of a Euclidean transformation are found, the two point clouds can be matched, i.e.:
pᵢ = Rp′ᵢ + t,  i = 1, …, n′
R and t can be solved with the ICP algorithm. The invention uses the linear-algebra solution, whose aim is to obtain R and t by minimizing the sum of squared errors:
min_{R,t} Σ_{i=1..n′} ‖pᵢ − (Rp′ᵢ + t)‖²
First, to compute the rotation matrix R, the centroids p and p′ of the two point clouds are calculated; then the de-centroided coordinates qᵢ and q′ᵢ of every point are computed:
qᵢ = pᵢ − p,  q′ᵢ = p′ᵢ − p′
Define the matrix W = Σᵢ qᵢq′ᵢᵀ; it is a 3×3 matrix. Applying SVD to W gives:
W = UΣVᵀ
R is then
R = UVᵀ
and the translation vector t can then be calculated as
t = p − Rp′
After the translation and rotation transformation, the point cloud in P′ is transformed into the coordinate system of P with the following formula:
p̂′ᵢ = Rp′ᵢ + t
thereby realizing the fusion of the two point clouds. This operation is applied to all point clouds until only one three-dimensional point cloud model remains, so as to complete the three-dimensional reconstruction of the entire outdoor scene.
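The closed-form alignment step above (centroids, W = Σᵢ qᵢq′ᵢᵀ, SVD, R = UVᵀ, t = p − Rp′) can be sketched in NumPy as follows; the determinant check guarding against reflections is a standard safeguard added here, not a step stated in the patent.

```python
import numpy as np

def align_point_clouds(P, P_prime):
    """Closed-form alignment of corresponding point sets (each n' x 3),
    returning R, t such that p_i ~= R p'_i + t."""
    p, p_prime = P.mean(axis=0), P_prime.mean(axis=0)   # centroids
    q, q_prime = P - p, P_prime - p_prime               # de-centroided coordinates
    W = q.T @ q_prime                                   # 3x3, W = sum_i q_i q'_i^T
    U, _, Vt = np.linalg.svd(W)                         # W = U Sigma V^T
    R = U @ Vt                                          # R = U V^T
    if np.linalg.det(R) < 0:                            # guard against a reflection
        U[:, -1] *= -1
        R = U @ Vt
    t = p - R @ p_prime                                 # t = p - R p'
    return R, t

# Fusion: transform the points of P' into the coordinate system of P and merge.
# R, t = align_point_clouds(P, P_prime)
# merged = np.vstack([P, P_prime @ R.T + t])
```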

Claims (2)

1. An outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network, characterized by comprising the following steps:
Step 1: train a fully convolutional neural network by supervised learning;
Step 2: perform depth estimation on each picture with the fully convolutional neural network;
a series of consecutive pictures of an outdoor scene is shot with a monocular camera; each picture is then fed as input to the trained fully convolutional neural network, which performs depth estimation on the picture and yields its three-dimensional point cloud model;
Step 3: fuse the three-dimensional models of all pictures into one complete three-dimensional model with the ICP algorithm;
Step 1 is implemented as follows:
1-1. prepare a large number of training pictures for training the network parameters;
each group of training pictures comprises an ordinary color picture of an outdoor scene shot from a certain angle, the depth picture corresponding to that color picture, and pixel-level semantic segmentation information; redundant data are rejected by means of the pixel-level semantic segmentation information in the SYNTHIA dataset;
1-2. carry out mathematical modelling of the picture data; let {Iᵢ, Dᵢ}, i = 1, …, N, denote the N groups of color pictures and depth pictures in the dataset, the camera intrinsic matrix K being known; for any pixel q in color picture Iᵢ, with homogeneous coordinates [x, y, 1]ᵀ (T denotes transposition), the corresponding point Q in three-dimensional space is calculated with the following formula:
Q = Dᵢ(q) · K⁻¹q    (Formula 1)
assume the normal vector of a plane in three-dimensional space is n ∈ R³; to make the normal vector of each plane unique, n is calculated as
n = n̄ / d
where n̄ denotes the unit normal vector of the plane, pointing from the origin toward the plane, and d denotes the distance of the plane from the origin; if a point Q lies on the plane, then it satisfies nᵀQ = 1;
assume color picture Iᵢ contains M planes; a pixel probability matrix Sᵢ is then constructed for the picture; Sᵢ(q) is an (M+1)-dimensional vector whose j-th element, denoted Sᵢ(q)ⱼ, indicates the probability that pixel q falls on the j-th plane, with j = 0 denoting non-planar; the plane parameters {nⱼ} of the i-th picture can be obtained by minimizing the following objective function:
L = Σ_q Σ_{j=1..M} Sᵢ(q)ⱼ · |nⱼᵀQ(q) − 1| + α · R(Sᵢ)
where R(Sᵢ) is a regularization term that prevents the network from producing the trivial result in which every pixel is assigned to the non-planar class, and α is a weighting coefficient; when pixel q is projected from the picture into three-dimensional space, the corresponding point must, owing to the perspective structure, lie on the ray through q; let λ denote the depth of the intersection of this ray with a plane; the three-dimensional coordinate of pixel q in space is then λK⁻¹q, so that
λ = 1 / (nᵀK⁻¹q)
the regularization term R(Sᵢ) is calculated with the following formula:
R(Sᵢ) = −Σ_q log(1 − Sᵢ(q)₀)
where 1 − Sᵢ(q)₀ denotes the probability that pixel q falls on a plane, with value range [0, 1];
the semantic information in the dataset is divided into two classes: "keep" = {building, road, sidewalk, lane marking} and "discard" = {pedestrian, car, sky, bicycle}; if a pixel belongs to the "keep" class, set z(q) = 1; if it belongs to the "discard" class, set z(q) = 0; the regularization term above is then rewritten as:
R(Sᵢ) = −Σ_q z(q) · log(1 − Sᵢ(q)₀)
the fully convolutional neural network is divided into two parts: one part segments the planes in the picture; the other generates the three-dimensional point cloud model of the picture; the two parts share the same abstract feature maps.
2. The outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network according to claim 1, characterized in that fusing the three-dimensional models of all pictures into one complete three-dimensional model with the ICP algorithm in step 3 is implemented as follows:
3-1. determine the overlapping part of the two point clouds;
first extract and match the feature points of the two pictures with the SIFT algorithm, obtaining matched point sets Q and Q′; from the transformation between the two point sets, the homography matrix H is obtained, i.e. Q′ = HQ;
then calculate the four vertex coordinates of the registered picture and carry out image registration, obtaining the set of pixels in the overlapping region of the two pictures; keep only the pixels belonging to the "keep" class according to the semantic information, finally obtaining the pixel set N′ = {1, …, n′};
for the two known point clouds, the overlapping regions can be expressed as:
P = {p₁, …, p_n′},  P′ = {p′₁, …, p′_n′}    (Formula 9)
3-2. find the rotation matrix R and translation vector t of a Euclidean transformation and match the two point clouds, i.e.:
pᵢ = Rp′ᵢ + t,  i = 1, …, n′
R and t are solved with the ICP algorithm, obtained by minimizing the sum of squared errors:
min_{R,t} Σ_{i=1..n′} ‖pᵢ − (Rp′ᵢ + t)‖²
first, to compute the rotation matrix R, calculate the centroids of the two point clouds;
then calculate the de-centroided coordinates qᵢ and q′ᵢ of every point:
qᵢ = pᵢ − p,  q′ᵢ = p′ᵢ − p′    (Formula 13)
define the matrix W = Σᵢ qᵢq′ᵢᵀ; W is a 3×3 matrix; applying SVD to W gives:
W = UΣVᵀ
R is then
R = UVᵀ
and the translation vector t is then calculated as
t = p − Rp′;
3-3. after the translation and rotation transformation, transform the point cloud in P′ into the coordinate system of P with the following formula:
p̂′ᵢ = Rp′ᵢ + t
thereby realizing the fusion of the two point clouds; this operation is applied to all point clouds until only one three-dimensional point cloud model remains, so as to complete the three-dimensional reconstruction of the entire outdoor scene.
CN201910193450.XA 2019-03-14 2019-03-14 Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network Pending CN110060331A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910193450.XA CN110060331A (en) 2019-03-14 2019-03-14 Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910193450.XA CN110060331A (en) 2019-03-14 2019-03-14 Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network

Publications (1)

Publication Number Publication Date
CN110060331A true CN110060331A (en) 2019-07-26

Family

ID=67316063

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910193450.XA Pending CN110060331A (en) 2019-03-14 2019-03-14 Three-dimensional rebuilding method outside a kind of monocular camera room based on full convolutional neural networks

Country Status (1)

Country Link
CN (1) CN110060331A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3349176A1 (en) * 2017-01-17 2018-07-18 Facebook, Inc. Three-dimensional scene reconstruction from set of two-dimensional images for consumption in virtual reality
CN109461180A (en) * 2018-09-25 2019-03-12 北京理工大学 A kind of method for reconstructing three-dimensional scene based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FENGTING YANG ET AL: "Recovering 3D Planes from a Single Image via Convolutional Neural Networks" *
陈英博: "Three-dimensional reconstruction technology combining Kinect point cloud data with sequential images", China Masters' Theses Full-text Database, Information Science and Technology Series *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781937A (en) * 2019-10-16 2020-02-11 广州大学 Point cloud feature extraction method based on global visual angle
CN110781937B (en) * 2019-10-16 2022-05-17 广州大学 Point cloud feature extraction method based on global visual angle
CN111340864B (en) * 2020-02-26 2023-12-12 浙江大华技术股份有限公司 Three-dimensional scene fusion method and device based on monocular estimation
CN111340864A (en) * 2020-02-26 2020-06-26 浙江大华技术股份有限公司 Monocular estimation-based three-dimensional scene fusion method and device
CN111918049A (en) * 2020-08-14 2020-11-10 广东申义实业投资有限公司 Three-dimensional imaging method and device, electronic equipment and storage medium
CN111918049B (en) * 2020-08-14 2022-09-06 广东申义实业投资有限公司 Three-dimensional imaging method and device, electronic equipment and storage medium
CN111709976A (en) * 2020-08-24 2020-09-25 湖南国科智瞳科技有限公司 Rapid registration method and system for microscopic image and computer equipment
CN112085801A (en) * 2020-09-08 2020-12-15 清华大学苏州汽车研究院(吴江) Calibration method for three-dimensional point cloud and two-dimensional image fusion based on neural network
CN112085801B (en) * 2020-09-08 2024-03-19 清华大学苏州汽车研究院(吴江) Calibration method for fusion of three-dimensional point cloud and two-dimensional image based on neural network
CN112381887A (en) * 2020-11-17 2021-02-19 广东电科院能源技术有限责任公司 Multi-depth camera calibration method, device, equipment and medium
CN112381887B (en) * 2020-11-17 2021-09-03 南方电网电力科技股份有限公司 Multi-depth camera calibration method, device, equipment and medium
CN113180832A (en) * 2021-04-21 2021-07-30 上海盼研机器人科技有限公司 Semi-surface short and small operation tractor positioning system based on mechanical arm
CN113674421A (en) * 2021-08-25 2021-11-19 北京百度网讯科技有限公司 3D target detection method, model training method, related device and electronic equipment
CN113674421B (en) * 2021-08-25 2023-10-13 北京百度网讯科技有限公司 3D target detection method, model training method, related device and electronic equipment
CN114937122A (en) * 2022-06-16 2022-08-23 黄冈强源电力设计有限公司 Rapid three-dimensional model reconstruction method for cement fiberboard house
CN116012564B (en) * 2023-01-17 2023-10-20 宁波艾腾湃智能科技有限公司 Equipment and method for intelligent fusion of three-dimensional model and live-action photo
CN116012564A (en) * 2023-01-17 2023-04-25 宁波艾腾湃智能科技有限公司 Equipment and method for intelligent fusion of three-dimensional model and live-action photo

Similar Documents

Publication Publication Date Title
CN110060331A Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network
CN108596101B (en) Remote sensing image multi-target detection method based on convolutional neural network
CN110622213B (en) System and method for depth localization and segmentation using 3D semantic maps
CN107679537B Pose estimation algorithm for texture-less space targets based on ORB feature matching of contour points
CN112150575B (en) Scene data acquisition method, model training method and device and computer equipment
Vineet et al. Incremental dense semantic stereo fusion for large-scale semantic scene reconstruction
CN108898676B (en) Method and system for detecting collision and shielding between virtual and real objects
Tian et al. Depth estimation using a self-supervised network based on cross-layer feature fusion and the quadtree constraint
CN108665496A End-to-end semantic simultaneous localization and mapping method based on deep learning
WO2022165809A1 (en) Method and apparatus for training deep learning model
CN107292965A Mutual occlusion processing method based on depth image data streams
WO2019239211A2 (en) System and method for generating simulated scenes from open map data for machine learning
CN106780592A Kinect depth reconstruction algorithm based on camera motion and image shading
CN106803267A Indoor scene three-dimensional reconstruction method based on Kinect
CN113256778B (en) Method, device, medium and server for generating vehicle appearance part identification sample
CN115272591B (en) Geographic entity polymorphic expression method based on three-dimensional semantic model
CN104537705A Augmented-reality-based mobile platform system and method for displaying three-dimensional biomolecules
Li et al. Three-dimensional traffic scenes simulation from road image sequences
CN116580161B (en) Building three-dimensional model construction method and system based on image and NeRF model
Hospach et al. Simulation of falling rain for robustness testing of video-based surround sensing systems
CN109727314A Augmented reality scene fusion and display method
Zhao et al. Autonomous driving simulation for unmanned vehicles
CN111599007B (en) Smart city CIM road mapping method based on unmanned aerial vehicle aerial photography
CN110378250A Training method and device for a neural network for scene cognition, and terminal device
CN114677479A (en) Natural landscape multi-view three-dimensional reconstruction method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190726