CN110060331A - Three-dimensional rebuilding method outside a kind of monocular camera room based on full convolutional neural networks - Google Patents
Three-dimensional rebuilding method outside a kind of monocular camera room based on full convolutional neural networks Download PDFInfo
- Publication number
- CN110060331A (application CN201910193450.XA)
- Authority
- CN
- China
- Prior art keywords
- picture
- plane
- pixel
- convolutional neural
- neural networks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
Abstract
The invention discloses an outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network. The method comprises the following steps. Step 1: train a fully convolutional neural network by means of supervised learning. Step 2: perform depth estimation on each picture with the fully convolutional neural network; a series of consecutive pictures of an outdoor scene is shot with a monocular camera, then each picture is taken as input and depth estimation is performed on it with the previously trained network, yielding its three-dimensional point cloud model. Step 3: fuse the three-dimensional models of all pictures into one complete three-dimensional model with the ICP algorithm. The invention solves the problem of three-dimensional reconstruction with a monocular camera, and can be realized on hardware systems such as an ordinary PC or workstation.
Description
Technical field
The invention belongs to the technical fields of computer vision and computer graphics, and in particular relates to an outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network.
Background technique
Three-dimensional reconstruction is an important and fundamental problem in computer vision and computer graphics, with a very wide range of applications in fields such as agriculture, industry, medicine, aerospace, the military, environmental observation and terrain exploration. One of its sub-branches, outdoor three-dimensional reconstruction of city scenes, plays an important role in fields such as map navigation and urban planning. Once a three-dimensional map of a city is available, people can conveniently view any corner of the city through various electronic devices; Google Maps is a very successful example in this respect. Combined with virtual reality and augmented reality, and integrated with living information, e-commerce, virtual communities and other services, it can bring people a more immersive experience. The research of outdoor three-dimensional reconstruction therefore has high scientific and application value.
In computer graphics, three-dimensional reconstruction with a monocular camera has always been an important and challenging problem. A monocular camera cannot directly obtain the depth of each pixel through triangulation, as a binocular camera does, or through time-of-flight or structured-light principles, as a depth camera does. However, after long development, monocular cameras are relatively mature in technology, low in cost, simple in structure, undemanding of computing resources and easier to commercialize; for example, the standard camera of almost every smartphone is a good monocular camera. Therefore, the method of the invention trains a fully convolutional neural network by means of supervised learning, performs depth estimation on each picture obtained by the monocular camera, and then fuses the results into one complete three-dimensional model, thereby completing the three-dimensional reconstruction.
Summary of the invention
The present invention aims to provide a useful solution to the three-dimensional reconstruction problem of monocular cameras. The input is a set of pictures of an outdoor scene shot by a monocular camera; the method of the invention performs depth estimation on each picture individually and finally fuses the results into one complete three-dimensional model.
The process proposed by the present invention comprises the following steps:
Step 1: train a fully convolutional neural network by means of supervised learning.
Step 2: perform depth estimation on each picture with the fully convolutional neural network. A series of consecutive pictures of an outdoor scene is shot with a monocular camera; then, taking each picture as input, depth estimation is performed on it with the previously trained network, yielding its three-dimensional point cloud model.
Step 3: fuse the three-dimensional model of each picture into one complete three-dimensional model with the ICP algorithm.
Step 1 is implemented as follows:
1-1. Prepare a large number of training pictures for training the network parameters. Each group of training pictures includes an ordinary color picture of the outdoor scene shot from a certain angle, the depth picture corresponding to that color picture, and pixel-level semantic segmentation information. Redundant data is rejected through the pixel-level semantic segmentation information in the SYNTHIA dataset.
1-2. Perform mathematical modeling on the picture data. Let {I_i, D_i}, i = 1, …, N denote the N groups of color pictures and depth pictures in the dataset, with known camera intrinsic matrix K. For any pixel q in color picture I_i, its homogeneous coordinates are [x, y, 1]^T, where T denotes transposition. Its corresponding point Q in three-dimensional space is then calculated with the following formula:
Q = D_i(q) · K^{-1} q   (Formula 1)
Suppose the normal vector of a plane in three-dimensional space is n̄, a real 1×3 vector. To make the normal vector of each plane unique, n is calculated as n = n̄/d, where n̄ denotes the unit normal vector of the plane, pointing from the origin toward the plane, and d denotes the distance of the plane from the origin. If a point Q lies on some plane, then n^T Q = 1 is satisfied.
Suppose there are M planes in color picture I_i; a pixel probability matrix S_i is then constructed for the color picture. S_i(q) is an (M+1)-dimensional vector whose j-th element, denoted S_i^j(q), indicates the probability that pixel q falls on the j-th plane, with j = 0 indicating non-planar. The plane parameters of the i-th picture can be obtained by minimizing an objective function consisting of a data term and a regularization term L_reg, where L_reg prevents the network from producing the trivial result in which all pixels are assigned to the non-planar class, and α is a weighting coefficient. When pixel q is projected from a picture into three-dimensional space, its corresponding point must, owing to the perspective structure, lie on a ray from q; denoting the depth of the intersection of this ray with the plane by λ, the three-dimensional coordinate of pixel q in space is λ K^{-1} q.
The regularization term L_reg is computed from the probability that pixel q falls on a plane (regardless of which plane), whose value range is [0, 1].
The semantic information in the dataset is divided into two classes: "retain" = {building, road, sidewalk, lane marking} and "discard" = {pedestrian, car, sky, bicycle}. If a pixel belongs to the "retain" class, let z(q) = 1; if it belongs to the "discard" class, let z(q) = 0. The regularization term above is then rewritten with z(q) as a per-pixel mask.
The fully convolutional neural network is divided into two parts: one part is used to segment the planes in the picture, and the other part is used to generate the three-dimensional point cloud model of the picture. The two parts share the same abstract feature maps.
Fusing the three-dimensional model of each picture into one complete three-dimensional model with the ICP algorithm, as described in Step 3, is implemented as follows:
3-1. Solve for the overlapping part of the two point clouds.
First, the SIFT algorithm is used to extract and match the feature points of two pictures, obtaining the matched point sets Q and Q′. From the transformation between these two point sets, the homography matrix H is obtained, i.e. Q′ = HQ.
Then the four vertex coordinates of the registered picture are calculated and image registration is carried out, yielding the pixel set of the overlapping region of the two pictures; through the semantic information, only the pixels belonging to the "retain" class are kept, finally obtaining the pixel set N′ = {1, …, n′}.
For the two known point clouds, the overlapping regions can be expressed as:
P = {p_1, …, p_{n′}},  P′ = {p′_1, …, p′_{n′}}   (Formula 9)
3-2. Find the rotation matrix R and translation vector t of a Euclidean transformation that matches the two point clouds, that is, p_i = R p′_i + t.
R and t are solved with the ICP algorithm by minimizing the sum of squared errors.
First, to calculate the rotation matrix R, the centroid positions p̄ and p̄′ of the two groups of point clouds are computed.
Then the de-centroided coordinates q_i and q′_i of every point in each group are calculated:
q_i = p_i − p̄,  q′_i = p′_i − p̄′   (Formula 13)
Define the matrix W = Σ_{i=1}^{n′} q_i q′_i^T; W is a 3×3 matrix. Performing SVD decomposition on W gives:
W = U Σ V^T
Then R is
R = U V^T
and the translation vector t is then calculated as
t = p̄ − R p̄′.
3-3. After the translation and rotation transformation, the point cloud in P′ is transformed into the coordinate system of P using p_i = R p′_i + t, thereby realizing the fusion of the two point clouds. This operation is applied to all point clouds until only one three-dimensional point cloud model remains, so as to complete the three-dimensional reconstruction of the entire outdoor scene.
Features and beneficial effects of the invention:
The present invention realizes an outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network, which is of considerable significance for three-dimensional reconstruction. The fully convolutional neural network trained by means of supervised learning can directly perform depth estimation on a color picture to obtain its three-dimensional point cloud model; all point cloud models are then fused, completing the three-dimensional reconstruction of the outdoor scene.
Compared with binocular cameras and depth cameras, the monocular camera has, after long development, relatively mature technology, low cost and a simple structure, and its demand on computing resources is not as high as that of the triangulation required by binocular cameras, so it is easier to commercialize. For example, the standard camera on almost every smartphone today is a monocular camera, and its imaging quality is good enough to be used directly.
This technique can be realized on hardware systems such as an ordinary PC or workstation.
Detailed description of the invention
Fig. 1 is an overall flow chart of the method of the present invention.
Fig. 2 shows the model of the fully convolutional neural network used in the present invention.
Specific embodiment
In order to make the above objects, features and advantages of the present invention clearer and more comprehensible, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network includes the following steps:
Step 1: train the fully convolutional neural network by means of supervised learning.
As with other neural network models, a large number of training pictures must first be prepared for training the network parameters. Each group of training pictures includes an ordinary color picture of the outdoor scene shot from a certain angle, the depth picture corresponding to that color picture, and pixel-level semantic segmentation information. Since collecting and labeling a dataset manually would cost a great deal of time and effort, the SYNTHIA dataset can be used. Although its data all come from a virtual city, the scenes simulated by computer have a certain similarity to the real world. It should be noted that the original purpose of this dataset is autonomous driving: the data are acquired by simulating a car driving under real traffic conditions and shooting a photo at regular intervals from a fixed position and angle on the vehicle. The dataset therefore contains many groups of nearly identical pictures, and redundant data can be rejected based on the vehicle speed to avoid meaningless computation. In addition, unwanted parts of each picture must also be rejected: information such as the pedestrians and cars in a picture need not be included in the reconstructed three-dimensional model, while information such as roads and building surfaces must be retained. This step can be completed through the pixel-level semantic segmentation information in the SYNTHIA dataset, and the detailed process is embodied in the regularization term below.
Before introducing the neural network model, mathematical modeling of the picture data is needed. Let {I_i, D_i}, i = 1, …, N denote the N groups of color pictures and depth pictures in the dataset, with known camera intrinsic matrix K. For any pixel q in color picture I_i, its homogeneous coordinates are [x, y, 1]^T, where T denotes transposition. Its corresponding point Q in three-dimensional space is then calculated with the following formula:
Q = D_i(q) · K^{-1} q   (Formula 1)
Almost all of the data involved in the three-dimensional reconstruction process concern plane information. Suppose the normal vector of a plane in three-dimensional space is n̄, a real 1×3 vector. To make the normal vector of each plane unique, n is calculated as n = n̄/d, where n̄ denotes the unit normal vector of the plane, pointing from the origin toward the plane, and d denotes the distance of the plane from the origin. If a point Q lies on some plane, then n^T Q = 1 is satisfied.
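Formula 1 and the plane-incidence condition n^T Q = 1 can be sketched with NumPy. This is only an illustration, not the patent's implementation; the intrinsic matrix below is a made-up example:

```python
import numpy as np

# Example camera intrinsics (fx, fy, cx, cy are illustrative values).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

def back_project(q_pixel, depth, K):
    """Formula 1: Q = D_i(q) * K^{-1} q for the homogeneous pixel q = [x, y, 1]^T."""
    q = np.array([q_pixel[0], q_pixel[1], 1.0])
    return depth * (np.linalg.inv(K) @ q)

def plane_param(unit_normal, d):
    """Unique plane parameter n = n_bar / d, so points on the plane satisfy n^T Q = 1."""
    return np.asarray(unit_normal, dtype=float) / d

# A plane 4 units from the origin along +z, and the pixel at the principal point,
# whose ray runs along the optical axis.
n = plane_param([0.0, 0.0, 1.0], d=4.0)
Q = back_project((320.0, 240.0), depth=4.0, K=K)
print(n @ Q)  # the back-projected point satisfies n^T Q = 1
```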
Suppose there are M planes in color picture I_i; a pixel probability matrix S_i is then constructed for the picture. S_i(q) is an (M+1)-dimensional vector whose j-th element, denoted S_i^j(q), indicates the probability that pixel q falls on the j-th plane, with j = 0 indicating non-planar. The plane parameters of the i-th picture can be obtained by minimizing an objective function consisting of a data term and a regularization term L_reg, where L_reg prevents the network from producing the trivial result in which all pixels are assigned to the non-planar class, and α is a weighting coefficient. When pixel q is projected from a picture into three-dimensional space, its corresponding point must, owing to the perspective structure, lie on a ray from q; denoting the depth of the intersection of this ray with the plane by λ, the three-dimensional coordinate of pixel q in space is λ K^{-1} q.
The regularization term L_reg is computed from the probability that pixel q falls on a plane (regardless of which plane), whose value range is [0, 1]. It should be noted that not all pixels participate in the three-dimensional reconstruction process. Whether pixels with different semantic information logically need to be reconstructed differs: pixels with semantic information such as roads and building exteriors should be included in the reconstructed three-dimensional point cloud model, while pixels with semantic information such as pedestrians and cars should be removed. Therefore, the semantic information in the dataset can be divided into two classes: "retain" = {building, road, sidewalk, lane marking, …} and "discard" = {pedestrian, car, sky, bicycle, …}. If a pixel belongs to the "retain" class, let z(q) = 1; if it belongs to the "discard" class, let z(q) = 0. The regularization term above can then be rewritten with z(q) as a per-pixel mask.
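The regularization formulas appear as images in the original patent and are not reproduced here. As a loose sketch only, assuming the masked regularizer simply sums the non-planar probability S_i^0(q) over "retain" pixels (an assumption, not the patent's exact formula), it could look like:

```python
import numpy as np

def masked_plane_regularizer(S, z):
    """Hypothetical reconstruction of the masked regularization term.

    S : (H, W, M+1) per-pixel plane probabilities; channel 0 is "non-planar".
    z : (H, W) semantic mask, 1 for "retain" pixels, 0 for "discard" pixels.
    Penalizes retained pixels that the network marks as non-planar, which
    discourages the trivial all-non-planar solution described in the text.
    """
    return float(np.sum(z * S[..., 0]))

# Toy example: a 2x2 picture with M = 1 plane.
S = np.array([[[0.75, 0.25], [0.25, 0.75]],
              [[0.50, 0.50], [0.00, 1.00]]])
z = np.array([[1, 1],
              [0, 1]])
print(masked_plane_regularizer(S, z))  # 0.75 + 0.25 + 0.0 = 1.0
```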
The fully convolutional neural network model used in the present invention is trained from scratch in the openly available TensorFlow framework; its structure is shown in Fig. 2. The whole network architecture is divided into two parts. One part is used to segment the planes in the picture: because the planes of an outdoor scene account for a substantial portion of the data in the whole reconstruction process, they are computed separately to guarantee the accuracy of the final result. In this part, the activation function of the prediction layer is the Softmax function, and all other layers use the ReLU function. The other part generates the three-dimensional point cloud model of the picture; it shares the same abstract feature maps with the first part. It contains two stride-2 convolutional layers (3×3×512), followed by a 1×1×3M convolutional layer that outputs the parameters of the M planes, and then a global average pooling layer. Except for the last layer, which needs no activation function, all layers use the ReLU function. In the final parameter design, α = 0.1 and the number of planes M = 5. When training the model, the Adam optimization algorithm can be used, with β₁ = 0.99, β₂ = 0.9999, a learning rate of 0.0001 and a batch size of 4.
Step 2: perform depth estimation on each picture with the fully convolutional neural network.
A series of consecutive pictures of an outdoor scene is shot with a monocular camera; then, taking each picture as input, depth estimation is performed on it with the previously trained fully convolutional neural network, obtaining its three-dimensional point cloud model.
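Turning a predicted depth map into the point cloud of Step 2 amounts to applying Formula 1 to every pixel at once. A minimal NumPy sketch (the toy intrinsics and the flat depth map are made-up illustrations):

```python
import numpy as np

def depth_to_point_cloud(depth, K):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud: Q = D(q) * K^{-1} q."""
    H, W = depth.shape
    xs, ys = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates [x, y, 1]^T, one column per pixel (3 x HW).
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    rays = np.linalg.inv(K) @ pix.astype(float)       # viewing ray per pixel
    return (rays * depth.reshape(-1)).T               # scale each ray by its depth

K = np.array([[500.0,   0.0, 2.0],
              [  0.0, 500.0, 1.5],
              [  0.0,   0.0, 1.0]])   # toy intrinsics for a 4x3 "picture"
depth = np.full((3, 4), 2.0)          # a flat wall 2 units in front of the camera
cloud = depth_to_point_cloud(depth, K)
print(cloud.shape)                    # (12, 3); every point ends up at z = 2.0
```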
Step 3: fuse the three-dimensional model of each picture into one complete three-dimensional model with the ICP algorithm.
After the three-dimensional point cloud model of each picture is obtained, the models must be fused into a single point cloud model. The Iterative Closest Point (hereinafter ICP) algorithm is a point cloud matching algorithm for solving the 3D-3D pose estimation problem. The point clouds of two pictures with adjacent shooting times are taken: since their shooting times are close, the difference between them is small, their three-dimensional point clouds overlap greatly, and they are well suited for matching and fusion.
Before applying the ICP algorithm, however, the overlapping part of the two point clouds must be determined. Since the point cloud models are estimated from the color pictures, the overlapping region of the two color pictures is calculated directly here to guarantee the accuracy of the overlapping pixel set. First, the SIFT algorithm is used to extract and match the feature points of the two pictures, obtaining the matched point sets Q and Q′. From the transformation between these two point sets, the homography matrix H can be obtained, i.e. Q′ = HQ. Then the four vertex coordinates of the registered picture are calculated and image registration is carried out, yielding the pixel set of the overlapping region of the two pictures; through the semantic information, only the pixels belonging to the "retain" class are kept, finally obtaining the pixel set N′ = {1, …, n′}.
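The overlap computation of step 3-1 can be illustrated as follows: given the homography H between two pictures (here a made-up pure translation rather than one estimated from SIFT matches), warping the four corners of one picture and clipping against the other picture's bounds yields the overlapping region:

```python
import numpy as np

def warp_points(H, pts):
    """Apply homography H to Nx2 points (Q' = HQ on homogeneous coordinates)."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = (H @ homog.T).T
    return mapped[:, :2] / mapped[:, 2:3]      # de-homogenize

def overlap_box(H, w, h):
    """Warp the four corners of a w x h picture and clip to the target picture bounds."""
    corners = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=float)
    warped = warp_points(H, corners)
    x0, y0 = np.maximum(warped.min(axis=0), 0.0)
    x1, y1 = np.minimum(warped.max(axis=0), [float(w), float(h)])
    return tuple(float(v) for v in (x0, y0, x1, y1))

# Toy homography: the second picture sits 100 px to the right of the first.
H = np.array([[1.0, 0.0, 100.0],
              [0.0, 1.0,   0.0],
              [0.0, 0.0,   1.0]])
print(overlap_box(H, w=640, h=480))  # (100.0, 0.0, 640.0, 480.0)
```

In practice the axis-aligned box would be replaced by the exact warped quadrilateral and intersected with the semantic "retain" mask, as the text describes.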
For the two known point clouds, the overlapping regions can be expressed as:
P = {p_1, …, p_{n′}},  P′ = {p′_1, …, p′_{n′}}
If the rotation matrix R and translation vector t of a Euclidean transformation are found, the two point clouds can be matched, that is, p_i = R p′_i + t. R and t can be solved with the ICP algorithm; the present invention uses a linear-algebra solution, whose purpose is to obtain R and t by minimizing the sum of squared errors.
First, to calculate the rotation matrix R, the centroid positions p̄ and p̄′ of the two groups of point clouds are computed, and then the de-centroided coordinates q_i and q′_i of every point in each group are calculated:
q_i = p_i − p̄,  q′_i = p′_i − p̄′
Define the matrix W = Σ_{i=1}^{n′} q_i q′_i^T; it is a 3×3 matrix. Performing SVD decomposition on W gives:
W = U Σ V^T
Then R is
R = U V^T
and the translation vector t can then be calculated as
t = p̄ − R p̄′.
After the translation and rotation transformation, the point cloud in P′ is transformed into the coordinate system of P using p_i = R p′_i + t, thereby realizing the fusion of the two point clouds. This operation is applied to all point clouds until only one three-dimensional point cloud model remains, so as to complete the three-dimensional reconstruction of the entire outdoor scene.
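The closed-form alignment of steps 3-2 and 3-3 (centroids, W = Σ q_i q′_i^T, SVD, R = UV^T, t = p̄ − Rp̄′) can be sketched and checked on synthetic data; this is an illustrative implementation of the SVD step only, with made-up test data, not the patent's code:

```python
import numpy as np

def align_svd(P, P_prime):
    """Solve p_i ≈ R p'_i + t in closed form via SVD (the linear-algebra step of ICP).

    P, P_prime : (n, 3) arrays of corresponding overlap points.
    """
    c, c_prime = P.mean(axis=0), P_prime.mean(axis=0)   # centroids
    Q, Q_prime = P - c, P_prime - c_prime               # de-centroided coordinates
    W = Q.T @ Q_prime                                   # 3x3, equals sum_i q_i q'_i^T
    U, _, Vt = np.linalg.svd(W)
    R = U @ Vt
    if np.linalg.det(R) < 0:                            # guard against a reflection
        U[:, -1] *= -1
        R = U @ Vt
    t = c - R @ c_prime
    return R, t

# Synthetic check: rotate a cloud 90 degrees about z and shift it, then recover R and t.
rng = np.random.default_rng(0)
P_prime = rng.random((50, 3))
R_true = np.array([[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0],
                   [0.0,  0.0, 1.0]])
t_true = np.array([1.0, 2.0, 3.0])
P = P_prime @ R_true.T + t_true
R, t = align_svd(P, P_prime)
merged = P_prime @ R.T + t        # transform P' into P's coordinate system and fuse
print(np.allclose(merged, P))     # True
```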
Claims (2)
1. An outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network, characterized by comprising the following steps:
Step 1: train a fully convolutional neural network by means of supervised learning;
Step 2: perform depth estimation on each picture with the fully convolutional neural network; a series of consecutive pictures of an outdoor scene is shot with a monocular camera, then, taking each picture as input, depth estimation is performed on it with the previously trained fully convolutional neural network, obtaining its three-dimensional point cloud model;
Step 3: fuse the three-dimensional model of each picture into one complete three-dimensional model with the ICP algorithm;
wherein Step 1 is implemented as follows:
1-1. prepare a large number of training pictures for training the network parameters; each group of training pictures includes an ordinary color picture of the outdoor scene shot from a certain angle, the depth picture corresponding to that color picture, and pixel-level semantic segmentation information; redundant data is rejected through the pixel-level semantic segmentation information in the SYNTHIA dataset;
1-2. perform mathematical modeling on the picture data; let {I_i, D_i} denote the N groups of color pictures and depth pictures in the dataset, with known camera intrinsic matrix K; for any pixel q in color picture I_i, its homogeneous coordinates are [x, y, 1]^T, where T denotes transposition; its corresponding point Q in three-dimensional space is calculated with the following formula:
Q = D_i(q) · K^{-1} q   (Formula 1)
suppose the normal vector of a plane in three-dimensional space is n̄, a real 1×3 vector; to make the normal vector of each plane unique, n is calculated as n = n̄/d, where n̄ denotes the unit normal vector of the plane, pointing from the origin toward the plane, and d denotes the distance of the plane from the origin; if a point Q lies on some plane, then n^T Q = 1 is satisfied;
suppose there are M planes in color picture I_i; a pixel probability matrix S_i is then constructed for the color picture; S_i(q) is an (M+1)-dimensional vector whose j-th element, denoted S_i^j(q), indicates the probability that pixel q falls on the j-th plane, with j = 0 indicating non-planar; the plane parameters of the i-th picture are obtained by minimizing an objective function consisting of a data term and a regularization term L_reg, where L_reg prevents the network from producing the trivial result in which all pixels are assigned to the non-planar class, and α is a weighting coefficient; when pixel q is projected from a picture into three-dimensional space, its corresponding point must, owing to the perspective structure, lie on a ray from q; denoting the depth of the intersection of this ray with the plane by λ, the three-dimensional coordinate of pixel q in space is λ K^{-1} q;
the regularization term L_reg is computed from the probability that pixel q falls on a plane, whose value range is [0, 1];
the semantic information in the dataset is divided into two classes: "retain" = {building, road, sidewalk, lane marking} and "discard" = {pedestrian, car, sky, bicycle}; if a pixel belongs to the "retain" class, let z(q) = 1; if it belongs to the "discard" class, let z(q) = 0; the regularization term above is then rewritten with z(q) as a per-pixel mask;
the fully convolutional neural network is divided into two parts: one part is used to segment the planes in the picture, and the other part is used to generate the three-dimensional point cloud model of the picture; the two parts share the same abstract feature maps.
2. The outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network according to claim 1, characterized in that fusing the three-dimensional model of each picture into one complete three-dimensional model with the ICP algorithm in Step 3 is implemented as follows:
3-1. solve for the overlapping part of the two point clouds;
first, the SIFT algorithm is used to extract and match the feature points of two pictures, obtaining the matched point sets Q and Q′; from the transformation between these two point sets, the homography matrix H is obtained, i.e. Q′ = HQ;
then the four vertex coordinates of the registered picture are calculated and image registration is carried out, yielding the pixel set of the overlapping region of the two pictures; through the semantic information, only the pixels belonging to the "retain" class are kept, finally obtaining the pixel set N′ = {1, …, n′};
for the two known point clouds, the overlapping regions can be expressed as:
P = {p_1, …, p_{n′}},  P′ = {p′_1, …, p′_{n′}}   (Formula 9)
3-2. find the rotation matrix R and translation vector t of a Euclidean transformation that matches the two point clouds, that is, p_i = R p′_i + t;
R and t are solved with the ICP algorithm by minimizing the sum of squared errors;
first, to calculate the rotation matrix R, the centroid positions p̄ and p̄′ of the two groups of point clouds are computed;
then the de-centroided coordinates q_i and q′_i of every point in each group are calculated:
q_i = p_i − p̄,  q′_i = p′_i − p̄′   (Formula 13)
define the matrix W = Σ_{i=1}^{n′} q_i q′_i^T; W is a 3×3 matrix; performing SVD decomposition on W gives:
W = U Σ V^T
then R is
R = U V^T
and the translation vector t is then calculated as
t = p̄ − R p̄′;
3-3. after the translation and rotation transformation, the point cloud in P′ is transformed into the coordinate system of P using p_i = R p′_i + t, thereby realizing the fusion of the two point clouds; this operation is applied to all point clouds until only one three-dimensional point cloud model remains, so as to complete the three-dimensional reconstruction of the entire outdoor scene.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910193450.XA CN110060331A (en) | 2019-03-14 | 2019-03-14 | Three-dimensional rebuilding method outside a kind of monocular camera room based on full convolutional neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910193450.XA CN110060331A (en) | 2019-03-14 | 2019-03-14 | Three-dimensional rebuilding method outside a kind of monocular camera room based on full convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110060331A true CN110060331A (en) | 2019-07-26 |
Family
ID=67316063
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910193450.XA Pending CN110060331A (en) | 2019-03-14 | 2019-03-14 | Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110060331A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3349176A1 (en) * | 2017-01-17 | 2018-07-18 | Facebook, Inc. | Three-dimensional scene reconstruction from set of two-dimensional images for consumption in virtual reality |
CN109461180A (en) * | 2018-09-25 | 2019-03-12 | 北京理工大学 | A kind of method for reconstructing three-dimensional scene based on deep learning |
Non-Patent Citations (2)
Title |
---|
FENGTING YANG ET AL: "Recovering 3D Planes from a Single Image via Convolutional Neural Networks", 《RECOVERING 3D PLANES FROM A SINGLE IMAGE VIA CONVOLUTIONAL NEURAL NETWORKS》 * |
陈英博 (CHEN Yingbo): "Three-dimensional reconstruction technology combining Kinect point cloud data with sequential images", 《China Masters' Theses Full-text Database, Information Science and Technology》 *
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110781937A (en) * | 2019-10-16 | 2020-02-11 | 广州大学 | Point cloud feature extraction method based on global visual angle |
CN110781937B (en) * | 2019-10-16 | 2022-05-17 | 广州大学 | Point cloud feature extraction method based on global visual angle |
CN111340864B (en) * | 2020-02-26 | 2023-12-12 | 浙江大华技术股份有限公司 | Three-dimensional scene fusion method and device based on monocular estimation |
CN111340864A (en) * | 2020-02-26 | 2020-06-26 | 浙江大华技术股份有限公司 | Monocular estimation-based three-dimensional scene fusion method and device |
CN111918049A (en) * | 2020-08-14 | 2020-11-10 | 广东申义实业投资有限公司 | Three-dimensional imaging method and device, electronic equipment and storage medium |
CN111918049B (en) * | 2020-08-14 | 2022-09-06 | 广东申义实业投资有限公司 | Three-dimensional imaging method and device, electronic equipment and storage medium |
CN111709976A (en) * | 2020-08-24 | 2020-09-25 | 湖南国科智瞳科技有限公司 | Rapid registration method and system for microscopic image and computer equipment |
CN112085801A (en) * | 2020-09-08 | 2020-12-15 | 清华大学苏州汽车研究院(吴江) | Calibration method for three-dimensional point cloud and two-dimensional image fusion based on neural network |
CN112085801B (en) * | 2020-09-08 | 2024-03-19 | 清华大学苏州汽车研究院(吴江) | Calibration method for fusion of three-dimensional point cloud and two-dimensional image based on neural network |
CN112381887A (en) * | 2020-11-17 | 2021-02-19 | 广东电科院能源技术有限责任公司 | Multi-depth camera calibration method, device, equipment and medium |
CN112381887B (en) * | 2020-11-17 | 2021-09-03 | 南方电网电力科技股份有限公司 | Multi-depth camera calibration method, device, equipment and medium |
CN113180832A (en) * | 2021-04-21 | 2021-07-30 | 上海盼研机器人科技有限公司 | Semi-surface short and small operation tractor positioning system based on mechanical arm |
CN113674421A (en) * | 2021-08-25 | 2021-11-19 | 北京百度网讯科技有限公司 | 3D target detection method, model training method, related device and electronic equipment |
CN113674421B (en) * | 2021-08-25 | 2023-10-13 | 北京百度网讯科技有限公司 | 3D target detection method, model training method, related device and electronic equipment |
CN114937122A (en) * | 2022-06-16 | 2022-08-23 | 黄冈强源电力设计有限公司 | Rapid three-dimensional model reconstruction method for cement fiberboard house |
CN116012564B (en) * | 2023-01-17 | 2023-10-20 | 宁波艾腾湃智能科技有限公司 | Equipment and method for intelligent fusion of three-dimensional model and live-action photo |
CN116012564A (en) * | 2023-01-17 | 2023-04-25 | 宁波艾腾湃智能科技有限公司 | Equipment and method for intelligent fusion of three-dimensional model and live-action photo |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110060331A (en) | Outdoor three-dimensional reconstruction method for a monocular camera based on a fully convolutional neural network | |
CN108596101B (en) | Remote sensing image multi-target detection method based on convolutional neural network | |
CN110622213B (en) | System and method for depth localization and segmentation using 3D semantic maps | |
CN107679537B (en) | Texture-free space target pose estimation algorithm based on contour point ORB feature matching | |
CN112150575B (en) | Scene data acquisition method, model training method and device and computer equipment | |
Vineet et al. | Incremental dense semantic stereo fusion for large-scale semantic scene reconstruction | |
CN108898676B (en) | Method and system for detecting collision and shielding between virtual and real objects | |
Tian et al. | Depth estimation using a self-supervised network based on cross-layer feature fusion and the quadtree constraint | |
CN108665496A (en) | End-to-end semantic simultaneous localization and mapping method based on deep learning | |
WO2022165809A1 (en) | Method and apparatus for training deep learning model | |
CN107292965A (en) | Mutual occlusion processing method based on depth image data streams | |
WO2019239211A2 (en) | System and method for generating simulated scenes from open map data for machine learning | |
CN106780592A (en) | Kinect depth reconstruction algorithm based on camera motion and image shading | |
CN106803267A (en) | Kinect-based indoor scene three-dimensional reconstruction method | |
CN113256778B (en) | Method, device, medium and server for generating vehicle appearance part identification sample | |
CN115272591B (en) | Geographic entity polymorphic expression method based on three-dimensional semantic model | |
CN104537705A (en) | Augmented reality based mobile platform three-dimensional biomolecule display system and method | |
Li et al. | Three-dimensional traffic scenes simulation from road image sequences | |
CN116580161B (en) | Building three-dimensional model construction method and system based on image and NeRF model | |
Hospach et al. | Simulation of falling rain for robustness testing of video-based surround sensing systems | |
CN109727314A (en) | Augmented reality scene fusion and display method | |
Zhao et al. | Autonomous driving simulation for unmanned vehicles | |
CN111599007B (en) | Smart city CIM road mapping method based on unmanned aerial vehicle aerial photography | |
CN110378250A (en) | Training method, device and the terminal device of neural network for scene cognition | |
CN114677479A (en) | Natural landscape multi-view three-dimensional reconstruction method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190726 |