CN113345032B - Initialization map building method and system based on wide-angle camera large distortion map

Initialization map building method and system based on wide-angle camera large distortion map

Info

Publication number
CN113345032B
CN113345032B
Authority
CN
China
Prior art keywords
model
camera
map
module
point
Prior art date
Legal status
Active
Application number
CN202110767999.2A
Other languages
Chinese (zh)
Other versions
CN113345032A (en)
Inventor
刘志励
范圣印
李一龙
王璀
张煜东
Current Assignee
Beijing Yihang Yuanzhi Technology Co Ltd
Original Assignee
Beijing Yihang Yuanzhi Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Yihang Yuanzhi Technology Co Ltd filed Critical Beijing Yihang Yuanzhi Technology Co Ltd
Priority to CN202110767999.2A priority Critical patent/CN113345032B/en
Publication of CN113345032A publication Critical patent/CN113345032A/en
Application granted granted Critical
Publication of CN113345032B publication Critical patent/CN113345032B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/02 Affine transformations
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The application relates to an initialization map building method and system based on the large distortion map of a wide-angle camera. The method comprises the following steps: step S1, determining a specific camera model by tracing incident light rays; step S2, extracting visual feature points from the distortion map and performing data association to generate incident-ray matching pairs; step S3, initializing the map according to the geometric relationships satisfied by the incident-ray matching pairs; step S4, selecting a geometric model through error comparison; and step S5, recursive scale recovery. The method provides a general camera incident-ray tracing scheme, so that any camera parameter model can be converted into the specific camera model provided by the application, and the direction of the incident ray of each pixel point is traced through this specific camera model, thereby completing the initialization of the visual map.

Description

Initialization map building method and system based on wide-angle camera large distortion map
Technical Field
The application relates to camera modeling, computer-vision positioning and visual three-dimensional reconstruction technologies, in particular to an initialization map building method and system based on the large distortion map of a wide-angle camera, and more particularly to a method for realizing initialization map building for a robot or an autonomous vehicle based on a large-distortion wide-angle camera. The method can be used in fields such as autonomous driving, fully autonomous robots, unmanned aerial vehicles, virtual reality and augmented reality.
Background
The visual initialization map is mainly used in simultaneous localization and mapping (SLAM) and three-dimensional reconstruction (structure from motion, SfM), and the accuracy of the initialization map directly influences the quality of the subsequent localization and of the three-dimensional map.
Chinese patents CN110458885A and CN110411457A use the pose change of the carrier, measured from the pulses of a wheel encoder, as one of the constraints of the optimization equation: the pose change is added to a least-squares optimization equation to recover the scale, while the reprojection error serves as the objective function, finally yielding the optimal pose after scale recovery.
During system initialization, United States patent US10444761 uses the integral of the wheel-encoder pulses measured between two image frames, or the integral of the IMU acceleration and angular velocity, to obtain the absolute pose change between the two frames; at the same time it computes the scale-free relative pose change from the visual geometric relationship between the two frames, and recovers the scale of the visual map from the ratio of the displacements computed by the two sensors. In addition, the above patents only support pinhole camera models.
Compared with a common pinhole camera, a fisheye or panoramic camera has a larger field of view, captures richer surrounding scene information at a single instant, and observes the scene more stably; but its camera model is also more complex, i.e. light rays undergo a more complex physical process during imaging. The multi-view geometry and optimization theory commonly used under the pinhole camera model are not directly applicable to visual positioning and three-dimensional reconstruction on pictures produced by a large-distortion wide-angle lens. Moreover, a visual map built by a single camera has scale uncertainty and cannot be used for controlling the actual carrier or for measuring obstacles. It is therefore critical to solve the applicability of the pinhole-based multi-view geometry and optimization theory to wide-angle large-distortion images under the observation model of a large-distortion camera, and to recover the absolute scale of the visual map constructed by a single wide-angle camera.
Disclosure of Invention
The application aims to provide an initialization map building method based on the large distortion map of a wide-angle camera. The method provides a general camera incident-ray tracing scheme, so that any camera parameter model can be converted into the specific camera model provided by the application; the direction of the incident ray of each pixel point is then traced through this specific camera model to complete the initialization of the visual map.
In order to achieve the above purpose, the application adopts the following technical scheme:
Step S1, determining a specific camera model by tracing incident light rays: traverse all pixel points on an image; back-project each pixel point into its corresponding incident-ray vector through a pre-calibrated camera model, and back-project the same pixel point into an incident-ray vector through an initial specific camera model; construct a least-squares error term between the incident-ray vector obtained by back projection of the pre-calibrated camera model and the one obtained by back projection of the initial specific camera model, sum the error terms generated by all pixel points on the image, and iteratively adjust the specific camera model so that the sum of the error terms is minimized, thereby determining the final specific camera model;
Step S2, extracting visual feature points of the distortion map and performing data association to generate incident-ray matching pairs: extract visual feature points on the distortion map, and judge whether two feature points form a valid matching point pair by computing the distance between the descriptors of their corresponding image blocks; then apply the specific camera model obtained in step S1 to each matching point pair, converting the visual feature points into the direction vectors of the corresponding incident rays and thereby generating the incident-ray matching pairs;
Step S3, initializing the map of the distortion map according to the geometric relationships satisfied by the incident-ray matching pairs: from the incident-ray matching pairs obtained in step S2, compute the epipolar and the homography geometric relationships they satisfy, decompose the epipolar matrix and the homography matrix to obtain the relative motion of the two corresponding image frames, and, with the relative motion known, recover the points in three-dimensional space corresponding to the incident-ray matching pairs; the homography geometric relationship is called the H model for short, and the epipolar geometric relationship the F model;
Step S4, selecting a geometric model through error comparison: from the results of step S3, compute the errors of the relative motions obtained by decomposing the epipolar matrix and the homography matrix respectively, and select the optimal relative-motion result by comparing the errors;
Step S5, recursive scale recovery: in the video frame sequence, input the pose with absolute scale of each image frame from a wheel encoder or an inertial navigation system, align the optimal inter-frame relative-motion results obtained in step S4 with these poses, and accumulate and adjust the absolute scale of the three-dimensional map points over several frames to obtain the final three-dimensional space point coordinates with absolute scale.
As a preferred embodiment of the present application, the initial specific camera model in step S1 is the specific camera model of a 4th-order polynomial camera, and for any pixel point x = (u, v) in the image the direction vector r = (x, y, z) of the corresponding incident ray is computed by back projection, specifically as follows:
(1) First, apply a two-dimensional planar affine transformation to the image pixel coordinates:
(u′, v′)ᵀ = s · A · (u − u_c, v − v_c)ᵀ
where u_c and v_c denote the center coordinates of the image in the horizontal and vertical directions respectively; A denotes the affine transformation matrix, whose initial value is usually set to the identity matrix during calibration of the specific camera, i.e. A = [1 0; 0 1]; and s denotes a scaling factor whose initial value is usually set to 1 during calibration of the specific camera;
(2) Compute the distance ρ of the affine-transformed pixel coordinates relative to the center of the 4th-order polynomial camera:
ρ = √(u′² + v′²)
(3) Compute the component of the incident ray along the z-axis of the camera coordinate system:
z′ = a₀ + a₁ρ + a₂ρ² + a₃ρ³ + a₄ρ⁴
where a₀, a₁, a₂, a₃, a₄ denote the coefficients of the polynomial;
the direction vector of the incident ray corresponding to the image pixel point is finally obtained as r = (u′, v′, z′).
As a preferred aspect of the present application, in step S1, when each pixel point is back-projected into its corresponding incident-ray vector through the pre-calibrated camera model, the back-projection process of the pre-calibrated camera model can be abstractly represented as a function f, and the incident ray of pixel point x traced by back projection of the pre-calibrated camera model is r′, that is: r′ = f(x)
The sum of least-squares error terms constructed from the incident rays obtained by back projection of the pre-calibrated camera model and those obtained by back projection of the specific camera model is:
g = argmin_g Σ_{i=1..n} (r_i − r′_i)²
where g represents the back projection of the specific camera model that is finally used and n represents the number of pixels on the image.
As a preferred aspect of the present application, the method used in step S2 to judge whether a pair is a valid matching point pair is as follows:
(1) Acquire two temporally adjacent images generated in continuous time, extract the feature points on each image respectively, and perform feature-point descriptor matching after extraction;
(2) For a given descriptor in the first image, compute the Hamming distance to the descriptor corresponding to each feature point in the second image and sort the distances from small to large; if the smallest descriptor distance is less than 60% of the second-smallest one and, at the same time, the smallest descriptor distance is less than 45, then the feature point in the first image and the feature point in the second image corresponding to the smallest distance are judged to be a valid match.
As a preferred aspect of the present application, when the incident-ray matching pair satisfies the homography geometric relationship in step S3, the relative motion of the two image frames is expressed as follows:
g(x₁) ≃ (R + t·n̄ᵀ/d) · g(x₂)
where the feature point observed at the first camera position has coordinates x₁ and matches the feature point x₂ observed at the second camera position; R denotes the relative rotation of the two positions, t the relative displacement, n̄ the normal vector of the plane containing the corresponding space points, and d the constant term of the plane equation.
As a preferred aspect of the present application, when the incident-ray matching pair satisfies the epipolar geometric relationship in step S3, the relative motion of the two image frames is expressed as follows:
g(x₁)ᵀ · (t × (R · g(x₂))) = 0
where the feature point observed at the first camera position has coordinates x₁ and matches the feature point x₂ observed at the second camera position; R denotes the relative rotation of the two positions and t the relative displacement.
Preferably, in step S3, when recovering the point in three-dimensional space corresponding to an incident-ray matching pair, the matched feature points x₁ and x₂ and the world coordinates X of the corresponding point in three-dimensional space should satisfy the following relationships:
g(x₁) × (P₁X) = 0
g(x₂) × (P₂X) = 0
where P₁ = [I 0] is the projection matrix of the camera at the first position and P₂ = [R t] is the projection matrix of the camera at the second position;
solving the above equations jointly yields the value of the spatial position X of the same three-dimensional point observed in the two consecutive frames of large-distortion images.
In a preferred embodiment of the present application, when selecting the geometric model in step S4, the model error is computed as follows:
e_sum = e₁ + e₂
where e₁ is the included-angle error between the incident ray of x₁ and the plane α, and e₂ is the included-angle error of x₂ with respect to the corresponding polar plane β; n_α is the normal vector of the plane α in which the incident ray of x₂ is coplanar with the displacement vector t, and n_β is the normal vector of the plane β formed by the incident ray of x₁ and the displacement vector t:
n_α = t × (R · g(x₂))
n_β = (−R⁻¹·t) × (R⁻¹·g(x₁))
If e_F > e_H, the H model is selected for map initialization; if e_H > e_F, the F model is selected for map initialization.
As a preferred aspect of the present application, in the recursive scale recovery of step S5, the absolute scale of the visual map is computed as:
s = Σ_{j<i} ‖R̄_{μj}⁻¹·(t̄_{μi} − t̄_{μj})‖ / Σ_{j<i} ‖R̂_{μj}⁻¹·(t̂_{μi} − t̂_{μj})‖
where R̄_{μj} denotes the rotation measured by the wheel encoder or the inertial navigation system at time μj, t̄_{μi} the displacement measured by the wheel encoder or the inertial navigation system at time μi, and t̄_{μj} the displacement measured by the wheel encoder or the inertial navigation system at time μj; R̂_{μj} denotes the rotation computed from the image information at time μj, t̂_{μi} the displacement computed from the image information at time μi, and t̂_{μj} the displacement computed from the image information at time μj;
then the true position X_r of a visual map point in three-dimensional space under the world coordinate system is recovered from the absolute scale as X_r = s · X,
where X is the coordinate value, in the world coordinate system, of the initial map point recovered through step S3.
Another aim of the application is to provide an initialization mapping system based on a wide-angle camera large distortion map, which comprises a specific camera model construction module, a feature point extraction module, a feature point matching module, a data association module, an incident ray association module, an H model solving module, an F model solving module, a distortion map initialization module, a model selection module, an absolute scale calculation module and an initialization mapping module;
the specific camera model construction module is used for determining a specific camera model used by the system according to a least square error term between an incident ray vector obtained by back projection of the constructed pre-calibrated camera model and an incident ray vector obtained by back projection of the initial specific camera model, and converting visual feature points into corresponding incident ray direction vectors according to the specific camera model;
the feature point extraction module is used for extracting visual feature points on the distortion graph;
the feature point matching module is used for carrying out feature point descriptor matching on the extracted feature points so that the extracted feature points all have corresponding descriptors;
the data association module is used for finding out feature points which can be effectively matched from two images which are generated in continuous time and are adjacent in time;
the incident ray association module is used for converting the corresponding relation of the feature points which are effectively matched into the corresponding relation of the incident rays at different moments in two images which are generated in continuous time and are adjacent in time;
the H model solving module is used for computing the relative motion of two consecutive image frames when the incident-ray matching pair satisfies the homography geometric relationship;
the F model solving module is used for computing the relative motion of two consecutive image frames when the incident-ray matching pair satisfies the epipolar geometric relationship;
the distortion map initialization module is used for solving the value of the spatial position X of the same three-dimensional point observed in two consecutive frames of large-distortion images from the data output by the H model solving module or the F model solving module;
the model selection module is used for determining, through error comparison, which model's solution to select;
the absolute scale calculation module is used for processing the data measured by the wheel encoder or the inertial navigation system and the data obtained by image information calculation to obtain the absolute scale of the visual map;
the initialization map building module is used for processing the X value obtained in the distortion map initialization module according to the absolute scale of the visual map to obtain the coordinate with the absolute scale of the final three-dimensional space point.
The application has the advantages and technical effects that:
1. The initialization mapping method provided by the application works directly on the distortion map produced by the wide-angle camera: it extends the multi-view geometric theory of the pinhole camera model into a system for visual positioning and mapping on large distortion maps and establishes a general incident-ray model (the specific camera model), thereby avoiding the computation consumed by full-image projection to undistort the map, reducing the system's demand on the computing performance of the operating platform, and making the method suitable for porting between platforms.
2. The method provided by the application retains the texture information of the original distortion map and fully exploits the larger field of view of the wide-angle camera; by using the information of the full map it obtains more durable and stable visual feature pixel blocks, so that initialization becomes easier during initialization mapping and a more accurate initial map is obtained.
3. In the initialization mapping process, the application provides an error calculation scheme based on the included angle between the incident ray and the polar plane under two-view geometry, and uses it to evaluate the stability and accuracy of the various two-view geometries, thereby ensuring that the optimal calculation result is output.
4. The initialization mapping method provided by the application adopts a recursive visual-map scale recovery method: by recursively computing scale factors from the visual map and the vehicle odometer over a period of time, the scale uncertainty of single-camera positioning and mapping is effectively resolved.
5. The application combines a synchronized wheel encoder or inertial navigation system to recursively compute and optimize the absolute scale of a single wide-angle camera during visual-map initialization, solving the problem of the absolute scale being undetermined when the visual map is built with only a single wide-angle camera. In addition, the obtained visual map with absolute scale can be applied on robot or autonomous-vehicle carriers for identifying obstacles in real space, planning paths, and the like.
Drawings
FIG. 1 is a flow chart of the initialization mapping method of the present application;
FIG. 2 is a block diagram of an initialization mapping system of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In addition, embodiments of the present disclosure and features of the embodiments may be combined with each other without conflict. The technical aspects of the present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
As shown in fig. 1, the initialization map building method based on a wide-angle camera large distortion map provided by the application comprises the following steps:
Step S1, determining a specific camera model by tracing incident light rays: traverse all pixel points on an image; back-project each pixel point into its corresponding incident-ray vector through a pre-calibrated camera model, and back-project the same pixel point into an incident-ray vector through an initial specific camera model; construct a least-squares error term between the incident-ray vector obtained by back projection of the pre-calibrated camera model and the one obtained by back projection of the initial specific camera model, sum the error terms generated by all pixel points on the image, and iteratively adjust the specific camera model so that the sum of the error terms is minimized, thereby determining the final specific camera model;
Step S2, extracting visual feature points of the distortion map and performing data association to generate incident-ray matching pairs: extract visual feature points on the distortion map, and judge whether two feature points form a valid matching point pair by computing the distance between the descriptors of their corresponding image blocks; then apply the specific camera model obtained in step S1 to each matching point pair, converting the visual feature points into the direction vectors of the corresponding incident rays and thereby generating the incident-ray matching pairs;
Step S3, initializing the map of the distortion map according to the geometric relationships satisfied by the incident-ray matching pairs: from the incident-ray matching pairs obtained in step S2, compute the epipolar and the homography geometric relationships they satisfy, decompose the epipolar matrix and the homography matrix to obtain the relative motion of the two corresponding image frames, and, with the relative motion known, recover the points in three-dimensional space corresponding to the incident-ray matching pairs; the homography geometric relationship is called the H model for short, and the epipolar geometric relationship the F model;
Step S4, selecting a geometric model through error comparison: from the results of step S3, compute the errors of the relative motions obtained by decomposing the epipolar matrix and the homography matrix respectively, and select the optimal relative-motion result by comparing the errors;
Step S5, recursive scale recovery: in the video frame sequence, input the pose with absolute scale of each image frame from a wheel encoder or an inertial navigation system, align the optimal inter-frame relative-motion results obtained in step S4 with these poses, and accumulate and adjust the absolute scale of the three-dimensional map points over several frames to obtain the final three-dimensional space point coordinates with absolute scale.
The following detailed description is provided to make clear to those skilled in the art how the above steps of the present application are implemented.
Step S1, determining a specific camera model by tracing incident light rays
(1) For any pixel point x = (u, v) in an image and the direction r = (x, y, z) of the incident ray corresponding to it, the application initially computes the incident-ray direction for each image pixel by back projection of the specific camera model of a 4th-order polynomial camera, as follows:
(1.1) First, apply a two-dimensional planar affine transformation to the image coordinates
(u′, v′)ᵀ = s · A · (u − u_c, v − v_c)ᵀ (1)
where u_c and v_c denote the center coordinates of the image in the horizontal and vertical directions respectively; A denotes the affine transformation matrix, whose initial value during camera calibration is usually set to the identity matrix, i.e. A = [1 0; 0 1]; and s denotes a scaling factor whose initial value is usually set to 1 during calibration of the specific camera. That is, in the initial setting of camera calibration, the projection of the incident ray on the image plane is considered to coincide with the direction from the image center to the pixel point.
(1.2) Compute the distance ρ of the affine-transformed pixel coordinates relative to the center of the 4th-order polynomial camera
ρ = √(u′² + v′²) (2)
(1.3) Compute the component of the incident ray along the z-axis of the camera coordinate system
z′ = a₀ + a₁ρ + a₂ρ² + a₃ρ³ + a₄ρ⁴ (3)
where a₀, a₁, a₂, a₃, a₄ denote the coefficients of the polynomial;
the direction vector of the incident ray corresponding to the image pixel point is finally obtained as
r = (u′, v′, z′) (4)
(2) When the pre-calibrated camera model back-projects a pixel into its corresponding incident-ray vector, the application represents this back-projection process as a function f; the incident ray of the same pixel point x traced by back projection of the pre-calibrated camera model is r′, namely:
r′ = f(x) (5)
(3) From the least-squares error term between the incident ray obtained by back projection of the pre-calibrated camera model and that obtained by back projection of the specific camera model, the sum of least-squares error terms is constructed by traversing all pixels on the image:
g = argmin_g Σ_{i=1..n} (r_i − r′_i)² (6)
where g represents the back projection of the specific camera model finally used in the application and n represents the number of pixels on the image.
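By way of illustration only, the back projection of steps (1.1)-(1.3) and the least-squares fit of formula (6) can be sketched in Python; numpy and scipy are assumed available, and all names here (backproject, fit_specific_model, the packing of u_c, v_c, s, A and a₀..a₄ into one parameter vector) are ours rather than the patent's:

    import numpy as np
    from scipy.optimize import least_squares

    def backproject(params, px):
        """Map N pixel coordinates (N, 2) to unit incident-ray vectors (N, 3)."""
        uc, vc, s = params[0], params[1], params[2]
        A = params[3:7].reshape(2, 2)   # affine matrix, identity at initialization
        a = params[7:12]                # polynomial coefficients a0..a4
        # (1.1) two-dimensional planar affine transform of the pixel coordinates
        uv = s * (px - np.array([uc, vc])) @ A.T
        # (1.2) distance rho of the transformed coordinates from the camera center
        rho = np.linalg.norm(uv, axis=1)
        # (1.3) z component of the incident ray from the 4th-order polynomial
        z = a[0] + a[1]*rho + a[2]*rho**2 + a[3]*rho**3 + a[4]*rho**4
        rays = np.column_stack([uv, z])
        return rays / np.linalg.norm(rays, axis=1, keepdims=True)

    def fit_specific_model(px, rays_ref, init_params):
        """Minimize the sum over all pixels of (r - r')^2, where r' (rays_ref)
        comes from the pre-calibrated model f and r from the specific model."""
        def residual(p):
            return (backproject(p, px) - rays_ref).ravel()
        return least_squares(residual, init_params).x

In this sketch the reference rays r′ = f(x) are evaluated once per pixel from the pre-calibrated model, so each optimizer iteration only has to evaluate the 4th-order polynomial model, keeping the per-iteration cost linear in the number of pixels.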
Step S2, extracting visual feature points of the distortion map and performing data association
Visual feature points are extracted directly on the distortion map obtained by the wide-angle camera, including but not limited to the existing SIFT, SURF and ORB feature points, and feature-point descriptor matching is performed after extraction. For example, the system extracts image feature points with the ORB algorithm, and every extracted ORB feature point has a corresponding 256-bit BRIEF binary descriptor. Consider two temporally adjacent pictures produced in continuous time: image one generated at time t₁ and image two generated at time t₂ (t₁ < t₂). For a given descriptor in image one, the Hamming distance to the BRIEF descriptor corresponding to each feature point in image two is computed, and the distances are sorted from small to large; if the smallest descriptor distance is less than 60% of the second-smallest one and, at the same time, the smallest descriptor distance is less than 45, then the feature point in image one and the feature point in image two corresponding to the smallest distance are considered a valid matching point pair. The back projection of the specific camera model obtained in step S1 is then used to convert the visual feature points into the direction vectors of the corresponding incident rays, and the matching relationship of the incident rays is obtained from the matching relationship of the two-dimensional plane feature points.
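A minimal sketch of this matching step, assuming OpenCV's ORB implementation and reusing the hypothetical backproject helper from the step S1 sketch; the 60% ratio and the absolute Hamming threshold of 45 are the values stated above, everything else is an assumption:

    import cv2
    import numpy as np

    def match_rays(img1, img2, cam_params):
        orb = cv2.ORB_create(2000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        # Hamming distances between 256-bit BRIEF descriptors, two best candidates
        bf = cv2.BFMatcher(cv2.NORM_HAMMING)
        pairs = []
        for m, n in bf.knnMatch(des1, des2, k=2):
            # valid match: best distance < 60% of second best AND < 45
            if m.distance < 0.6 * n.distance and m.distance < 45:
                pairs.append((kp1[m.queryIdx].pt, kp2[m.trainIdx].pt))
        if not pairs:
            return np.empty((0, 3)), np.empty((0, 3))
        p1, p2 = map(np.asarray, zip(*pairs))
        # convert the matched 2-D feature points into incident-ray direction vectors
        return backproject(cam_params, p1), backproject(cam_params, p2)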
Step S3, map initialization on the distortion map
(1) The relative motion of two adjacent image frames under different geometric relationships is solved from the matched unit incident rays and the corresponding geometric relationships satisfied between them.
The corresponding geometric relationships satisfied between the unit incident rays mainly include two types:
The first is the homography geometric relationship, called the H model for short: the points in physical space corresponding to the matched feature points all lie on one plane, whose equation is
n̄ᵀ·X = d
where n̄ denotes the normal vector of the plane and d denotes the constant term of the plane equation.
In this case, the relative motion of the two adjacent image frames is expressed as follows:
g(x₁) ≃ (R + t·n̄ᵀ/d) · g(x₂) (7)
where the feature point observed at the first camera position has coordinates x₁ and matches the feature point x₂ observed at the second camera position; R denotes the relative rotation of the two positions and t the relative displacement;
The second is the epipolar geometric relationship of the unit incident rays, called the F model for short, i.e. it is assumed that some or all of the points in physical space corresponding to the matched feature points do not lie on the same plane;
in this case, the relative motion of the two adjacent image frames is expressed as follows:
g(x₁)ᵀ · (t × (R · g(x₂))) = 0 (8)
where the feature point observed at the first camera position has coordinates x₁ and matches the feature point x₂ observed at the second camera position; R denotes the relative rotation of the two positions and t the relative displacement;
Through the above relationships, the rotation matrix R and the displacement vector t of the two camera positions are obtained by jointly solving the equation system formed by multiple matching point pairs.
(2) After the relative pose of the two consecutive positions of the wide-angle camera has been computed, the positions of the three-dimensional space points observed by the corresponding feature-point pairs are recovered by triangulating the unit incident rays corresponding to the feature points.
The feature point observed at the first camera position has coordinates x₁ and matches the feature point x₂ observed at the second camera position; R denotes the relative rotation of the two positions and t the relative displacement; at the same time, X is the world coordinate of the three-dimensional space point corresponding to x₁ and x₂. The projection matrix of the camera at the first position is
P₁ = [I 0] (9)
where I denotes the 3×3 identity matrix and 0 the 3×1 zero matrix;
the projection matrix of the camera at the second position is
P₂ = [R t] (10)
The matched feature points of the two frames and the point X in the corresponding physical space satisfy the following equations
g(x₁) × (P₁X) = 0 (11)
g(x₂) × (P₂X) = 0 (12)
Solving the above equations jointly yields the value of the spatial position X of the same three-dimensional point observed in the two consecutive frames of large-distortion images.
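A sketch of this triangulation, assuming numpy: each cross-product constraint (11)-(12) contributes linear rows in the homogeneous coordinates of X, and the stacked system is solved by singular value decomposition (the usual homogeneous DLT); the function names are ours:

    import numpy as np

    def skew(v):
        """Matrix [v]x such that skew(v) @ w == np.cross(v, w)."""
        return np.array([[0., -v[2], v[1]],
                         [v[2], 0., -v[0]],
                         [-v[1], v[0], 0.]])

    def triangulate(ray1, ray2, R, t):
        P1 = np.hstack([np.eye(3), np.zeros((3, 1))])  # camera at first position
        P2 = np.hstack([R, t.reshape(3, 1)])           # camera at second position
        # g(x1) x (P1 X) = 0 and g(x2) x (P2 X) = 0, stacked into one system
        A = np.vstack([skew(ray1) @ P1, skew(ray2) @ P2])
        _, _, Vt = np.linalg.svd(A)
        X = Vt[-1]
        return X[:3] / X[3]   # de-homogenize to the 3-D point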
Step S4, geometric model selection
From the rotation R and translation vector t obtained in step S3, the matched point pair x₁ and x₂, and the point X in the corresponding three-dimensional space, the error computation of the two geometric models is determined.
From the three-dimensional space geometry, there exists a plane α such that the incident ray of x₂ is coplanar with the displacement vector t; the normal vector of plane α is:
n_α = t × (R · g(x₂)) (13)
There also exists a plane β such that the incident ray of x₁ is coplanar with the displacement vector t; the normal vector of plane β is:
n_β = (−R⁻¹·t) × (R⁻¹·g(x₁)) (14)
Since t also lies in this plane, the optical centers of both cameras lie in this plane, i.e. the plane equation is
n_βᵀ·X = 0 (15)
From the above, the included angle Δθ between the incident ray of x₁ and the plane α is:
e₁ = Δθ = arcsin( |n_αᵀ·g(x₁)| / (‖n_α‖·‖g(x₁)‖) ) (16)
Similarly, the included-angle error of x₂ with respect to the corresponding polar plane β is:
e₂ = arcsin( |n_βᵀ·g(x₂)| / (‖n_β‖·‖g(x₂)‖) ) (17)
The error of the model is defined as:
e_sum = e₁ + e₂ (18)
If e_F > e_H, the H model is selected for map initialization; if e_H > e_F, the F model is selected for map initialization.
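The error comparison of formulas (13)-(18) can be sketched as follows; the arcsin form of the ray-to-plane angle follows our reconstruction of formulas (16)-(17) above and, like the function names, is an assumption:

    import numpy as np

    def ray_plane_angle(ray, normal):
        # |n . r| / (|n||r|) is the sine of the angle between the ray and the plane
        s = abs(normal @ ray) / (np.linalg.norm(normal) * np.linalg.norm(ray))
        return np.arcsin(np.clip(s, 0.0, 1.0))

    def model_error(r1, r2, R, t):
        n_alpha = np.cross(t, R @ r2)            # formula (13)
        n_beta = np.cross(-R.T @ t, R.T @ r1)    # formula (14), R^-1 = R^T
        e1 = ray_plane_angle(r1, n_alpha)
        e2 = ray_plane_angle(r2, n_beta)
        return e1 + e2                           # formula (18)

Summing model_error over the matches under the H-model motion gives e_H, and over those under the F-model motion gives e_F; the model with the smaller total error is kept.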
Step S5, recursive scale recovery
This step recursively adjusts the scale of the map frame by frame through the multi-frame observations in a video sequence: when the scale information of the map points observed by the k-th frame is adjusted, the scale adjustment through frames 1 to k−1 that observe those map points must already have been completed; and when the scale of frame k−1 was adjusted, the scale of its map points had already been adjusted through frames 1 to k−2. For each image frame in the video sequence, steps S2 and S3 yield the corresponding motion information {R̂_{μi}, t̂_{μi}}, where μ is the time instant corresponding to the current image, K is the sequence number of the current image, and i is the sequence number of an image in the video sequence acquired by the camera; the inter-frame relative motions are composed accordingly into poses in three-dimensional space.
By integrating the data of the wheel encoder or the data of the inertial navigation system, the absolute odometer pose O_μ = {R̄_μ, t̄_μ} of the carrier at the time μ of the corresponding image can be obtained; corresponding to the image sequence numbers, the odometer poses of the carrier at the times of the images with different sequence numbers are obtained.
For 0 ≤ i ≤ K and 0 ≤ j < i, the absolute scale of the visual map is computed as:
s = Σ_{j<i} ‖R̄_{μj}⁻¹·(t̄_{μi} − t̄_{μj})‖ / Σ_{j<i} ‖R̂_{μj}⁻¹·(t̂_{μi} − t̂_{μj})‖ (19)
where R̄_{μj} denotes the rotation measured by the wheel encoder or the inertial navigation system at time μj, t̄_{μi} the displacement measured by the wheel encoder or the inertial navigation system at time μi, and t̄_{μj} the displacement measured by the wheel encoder or the inertial navigation system at time μj; R̂_{μj} denotes the rotation computed from the image information at time μj, t̂_{μi} the displacement computed from the image information at time μi, and t̂_{μj} the displacement computed from the image information at time μj;
and the true position X_r of a visual map point in three-dimensional space under the world coordinate system is recovered as
X_r = s · X (20)
where X is the coordinate value, in the world coordinate system, of the initial map point recovered through step S3.
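Under our reconstruction of formula (19), the recursive scale recovery can be sketched as follows; the per-frame pose arrays and the double loop over frame pairs (j, i) are our reading of the accumulation described in this step:

    import numpy as np

    def absolute_scale(odo_R, odo_t, vis_R, vis_t, K):
        """odo_R, odo_t: wheel-encoder / inertial-navigation rotations and
        positions per frame; vis_R, vis_t: the scale-free visual poses
        recovered in steps S2-S4. Returns the scale s of formula (19)."""
        num, den = 0.0, 0.0
        for i in range(1, K + 1):
            for j in range(i):
                d_odo = odo_R[j].T @ (odo_t[i] - odo_t[j])
                d_vis = vis_R[j].T @ (vis_t[i] - vis_t[j])
                num += np.linalg.norm(d_odo)
                den += np.linalg.norm(d_vis)
        return num / den

    # formula (20): map points are then rescaled as X_r = s * X

Accumulating over all pairs (j, i) rather than a single pair averages out the noise of individual odometer and visual displacement measurements.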
Embodiment 2: An initialization mapping system based on a wide-angle camera large distortion map
As shown in fig. 2, the initialization mapping system based on the wide-angle camera large distortion map provided by the application comprises a specific camera model construction module 1, a feature point extraction module 2, a feature point matching module 3, a data association module 4, an incident ray association module 5, an H model solving module 6, an F model solving module 7, a distortion map initialization module 8, a model selection module 9, an absolute scale calculation module 10 and an initialization mapping module 11;
the specific camera model construction module 1 is configured to determine a specific camera model used by the system according to a least square error term between an incident ray vector obtained by back projection of a constructed pre-calibrated camera model and an incident ray vector obtained by back projection of an initial specific camera model, and convert a visual feature point into a corresponding incident ray direction vector according to the specific camera model;
a feature point extraction module 2, configured to extract visual feature points on the distortion map;
the feature point matching module 3 is used for performing feature point descriptor matching on the extracted feature points so that the extracted feature points have corresponding descriptors;
the data association module 4 is used for finding out feature points which can be effectively matched from two images which are generated in continuous time and are adjacent in time;
the incident ray association module 5 is used for converting the corresponding relation of the feature points which are effectively matched into the corresponding relation of the incident rays at different moments in two images which are generated in continuous time and are adjacent in time;
the H model solving module 6 is used for computing the relative motion of two consecutive image frames when the incident-ray matching pair satisfies the homography geometric relationship;
the F model solving module 7 is used for computing the relative motion of two consecutive image frames when the incident-ray matching pair satisfies the epipolar geometric relationship;
the distortion map initialization module 8 is used for solving the value of the spatial position X of the same three-dimensional point observed in two consecutive frames of large-distortion images from the data output by the H model solving module or the F model solving module;
the model selection module 9 determines, through error comparison, which model's solution to select;
the absolute scale calculation module 10 is configured to process data measured by the wheel encoder or the inertial navigation system and data obtained by calculating image information to obtain an absolute scale of the visual map;
the initialization map building module 11 is configured to process the X value obtained in the distortion map initialization module according to the absolute scale of the visual map, so as to obtain the coordinate with the absolute scale of the final three-dimensional space point.
The initialization map building method and system provided by the application can be used in fields such as automatic driving, fully automatic robots, unmanned aerial vehicles, virtual reality and augmented reality. When used for automatic driving, initialization mapping for the autonomous vehicle can be realized; when used in the fully automatic robot field, initialization mapping for the robot can be realized; when used in the unmanned aerial vehicle field, initialization mapping for the unmanned aerial vehicle can be realized. The method obtains a more accurate initial map and thereby guarantees the effect of subsequent map building.
In addition, it must be explained that the mapping method finally obtains three-dimensional space point coordinates with absolute scale, and the resulting visual map with absolute scale can be applied on robot or autonomous-vehicle carriers for identifying obstacles in real space, planning paths, and the like; therefore, any field or application involving the technical scheme of the application belongs to the protection scope of the application.

Claims (10)

1. An initialization map building method based on a wide-angle camera large distortion map, characterized by comprising the following steps:
Step S1, determining a specific camera model by tracing incident light rays: traverse all pixel points on an image; back-project each pixel point into its corresponding incident-ray vector through a pre-calibrated camera model, and back-project the same pixel point into an incident-ray vector through an initial specific camera model; construct a least-squares error term between the incident-ray vector obtained by back projection of the pre-calibrated camera model and the one obtained by back projection of the initial specific camera model, sum the error terms generated by all pixel points on the image, and iteratively adjust the specific camera model so that the sum of the error terms is minimized, thereby determining the final specific camera model;
Step S2, extracting visual feature points of the distortion map and performing data association to generate incident-ray matching pairs: extract visual feature points on the distortion map, and judge whether two feature points form a valid matching point pair by computing the distance between the descriptors of their corresponding image blocks; then apply the specific camera model obtained in step S1 to each matching point pair, converting the visual feature points into the direction vectors of the corresponding incident rays and thereby generating the incident-ray matching pairs;
Step S3, initializing the map of the distortion map according to the geometric relationships satisfied by the incident-ray matching pairs: from the incident-ray matching pairs obtained in step S2, compute the epipolar and the homography geometric relationships they satisfy, decompose the epipolar matrix and the homography matrix to obtain the relative motion of the two corresponding image frames, and, with the relative motion known, recover the points in three-dimensional space corresponding to the incident-ray matching pairs; the homography geometric relationship is called the H model for short, and the epipolar geometric relationship the F model;
Step S4, selecting a geometric model through error comparison: from the results of step S3, compute the errors of the relative motions obtained by decomposing the epipolar matrix and the homography matrix respectively, and select the optimal relative-motion result by comparing the errors;
Step S5, recursive scale recovery: in the video frame sequence, input the pose with absolute scale of each image frame from a wheel encoder or an inertial navigation system, align the optimal inter-frame relative-motion results obtained in step S4 with these poses, and accumulate and adjust the absolute scale of the three-dimensional map points over several frames to obtain the final three-dimensional space point coordinates with absolute scale.
2. The initialization mapping method based on the wide-angle camera large distortion map of claim 1, characterized in that: the initial specific camera model in step S1 is the specific camera model of a 4th-order polynomial camera, and for any pixel point x = (u, v) in the image the direction vector r = (x, y, z) of the corresponding incident ray is computed by back projection, specifically as follows:
(1) First, apply a two-dimensional planar affine transformation to the image pixel coordinates:
(u′, v′)ᵀ = s · A · (u − u_c, v − v_c)ᵀ
where u_c and v_c denote the center coordinates of the image in the horizontal and vertical directions respectively; A denotes the affine transformation matrix, whose initial value is usually set to the identity matrix during calibration of the specific camera, i.e. A = [1 0; 0 1]; and s denotes a scaling factor whose initial value is usually set to 1 during calibration of the specific camera;
(2) Compute the distance ρ of the affine-transformed pixel coordinates relative to the center of the 4th-order polynomial camera:
ρ = √(u′² + v′²)
(3) Compute the component of the incident ray along the z-axis of the camera coordinate system:
z′ = a₀ + a₁ρ + a₂ρ² + a₃ρ³ + a₄ρ⁴
where a₀, a₁, a₂, a₃, a₄ denote the coefficients of the polynomial;
the direction vector of the incident ray corresponding to the image pixel point is finally obtained as r = (u′, v′, z′).
3. The initialization mapping method based on the wide-angle camera large distortion map of claim 2, characterized in that: in step S1, when each pixel point is back-projected into its corresponding incident-ray vector through the pre-calibrated camera model, the back-projection process of the pre-calibrated camera model can be abstractly represented as a function f, and the incident ray of pixel point x traced by back projection of the pre-calibrated camera model is r′, that is: r′ = f(x)
The sum of least-squares error terms constructed from the incident rays obtained by back projection of the pre-calibrated camera model and those obtained by back projection of the specific camera model is:
g = argmin_g Σ_{i=1..n} (r_i − r′_i)²
where g represents the back projection of the specific camera model that is finally used and n represents the number of pixels on the image.
4. The initialization mapping method based on the wide-angle camera large distortion map of claim 1, characterized in that: the method used in step S2 to judge whether a pair is a valid matching point pair is as follows:
(1) Acquire two temporally adjacent images generated in continuous time, extract the feature points on each image respectively, and perform feature-point descriptor matching after extraction;
(2) For a given descriptor in the first image, compute the Hamming distance to the descriptor corresponding to each feature point in the second image and sort the distances from small to large; if the smallest descriptor distance is less than 60% of the second-smallest one and, at the same time, the smallest descriptor distance is less than 45, then the feature point in the first image and the feature point in the second image corresponding to the smallest distance are judged to be a valid match.
5. The initialization mapping method based on the wide-angle camera large distortion map of claim 3, characterized in that: in step S3, when the incident-ray matching pair satisfies the homography geometric relationship, the relative motion of the two image frames is expressed as follows:
g(x₁) ≃ (R + t·n̄ᵀ/d) · g(x₂)
where the feature point observed at the first camera position has coordinates x₁ and matches the feature point x₂ observed at the second camera position; R denotes the relative rotation of the two positions, t the relative displacement, n̄ the normal vector of the plane containing the corresponding space points, and d the constant term of the plane equation.
6. The initialization mapping method based on the wide-angle camera large distortion map of claim 3, characterized in that: in step S3, when the incident-ray matching pair satisfies the epipolar geometric relationship, the relative motion of the two image frames is expressed as follows:
g(x₁)ᵀ · (t × (R · g(x₂))) = 0
where the feature point observed at the first camera position has coordinates x₁ and matches the feature point x₂ observed at the second camera position; R denotes the relative rotation of the two positions and t the relative displacement.
7. The initialization mapping method based on the wide-angle camera large distortion map of claim 3, characterized in that: in step S3, when recovering the point in three-dimensional space corresponding to an incident-ray matching pair, the matched feature points x₁ and x₂ and the world coordinates X of the corresponding point in three-dimensional space should satisfy the following relationships:
g(x₁) × (P₁X) = 0
g(x₂) × (P₂X) = 0
where P₁ = [I 0] is the projection matrix of the camera at the first position and P₂ = [R t] is the projection matrix of the camera at the second position;
solving the above equations jointly yields the value of the spatial position X of the same three-dimensional point observed in the two consecutive frames of large-distortion images.
8. The initialization mapping method based on the wide-angle camera large distortion map of claim 3, characterized in that: in step S4, when selecting the geometric model, the model error is computed as follows:
e_sum = e₁ + e₂
where e₁ is the included-angle error between the incident ray of x₁ and the plane α, and e₂ is the included-angle error of x₂ with respect to the corresponding polar plane β; n_α is the normal vector of the plane α in which the incident ray of x₂ is coplanar with the displacement vector t, and n_β is the normal vector of the plane β formed by the incident ray of x₁ and the displacement vector t:
n_α = t × (R · g(x₂))
n_β = (−R⁻¹·t) × (R⁻¹·g(x₁))
If e_F > e_H, the H model is selected for map initialization; if e_H > e_F, the F model is selected for map initialization.
9. The initialization mapping method based on the wide-angle camera large distortion map of claim 3, characterized in that: in the recursive scale recovery of step S5, the absolute scale of the visual map is computed as:
s = Σ_{j<i} ‖R̄_{μj}⁻¹·(t̄_{μi} − t̄_{μj})‖ / Σ_{j<i} ‖R̂_{μj}⁻¹·(t̂_{μi} − t̂_{μj})‖
where R̄_{μj} denotes the rotation measured by the wheel encoder or the inertial navigation system at time μj, t̄_{μi} the displacement measured by the wheel encoder or the inertial navigation system at time μi, and t̄_{μj} the displacement measured by the wheel encoder or the inertial navigation system at time μj; R̂_{μj} denotes the rotation computed from the image information at time μj, t̂_{μi} the displacement computed from the image information at time μi, and t̂_{μj} the displacement computed from the image information at time μj;
then the true position X_r of a visual map point in three-dimensional space under the world coordinate system is recovered from the absolute scale as X_r = s · X,
where X is the coordinate value, in the world coordinate system, of the initial map point recovered through step S3.
10. An initialization map building system based on a wide-angle camera large distortion map, characterized in that: the system comprises a specific camera model construction module, a feature point extraction module, a feature point matching module, a data association module, an incident ray association module, an H model solving module, an F model solving module, a distortion map initialization module, a model selection module, an absolute scale calculation module and an initialization mapping module;
the specific camera model construction module is used for determining a specific camera model used by the system according to a least square error term between an incident ray vector obtained by back projection of the constructed pre-calibrated camera model and an incident ray vector obtained by back projection of the initial specific camera model, and converting visual feature points into corresponding incident ray direction vectors according to the specific camera model;
the feature point extraction module is used for extracting visual feature points on the distortion graph;
the feature point matching module is used for carrying out feature point descriptor matching on the extracted feature points so that the extracted feature points all have corresponding descriptors;
the data association module is used for finding out feature points which can be effectively matched from two images which are generated in continuous time and are adjacent in time;
the incident ray association module is used for converting the corresponding relation of the feature points which are effectively matched into the corresponding relation of the incident rays at different moments in two images which are generated in continuous time and are adjacent in time;
the H model solving module is used for computing the relative motion of two consecutive image frames when the incident-ray matching pair satisfies the homography geometric relationship;
the F model solving module is used for computing the relative motion of two consecutive image frames when the incident-ray matching pair satisfies the epipolar geometric relationship;
the distortion map initialization module is used for solving the value of the spatial position X of the same three-dimensional point observed in two consecutive frames of large-distortion images from the data output by the H model solving module or the F model solving module;
the model selection module is used for determining, through error comparison, which model's solution to select;
the absolute scale calculation module is used for processing the data measured by the wheel encoder or the inertial navigation system and the data obtained by image information calculation to obtain the absolute scale of the visual map;
the initialization map building module is used for processing the X value obtained in the distortion map initialization module according to the absolute scale of the visual map to obtain the coordinate with the absolute scale of the final three-dimensional space point.
CN202110767999.2A 2021-07-07 2021-07-07 Initialization map building method and system based on wide-angle camera large distortion map Active CN113345032B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110767999.2A CN113345032B (en) 2021-07-07 2021-07-07 Initialization map building method and system based on wide-angle camera large distortion map

Publications (2)

Publication Number Publication Date
CN113345032A CN113345032A (en) 2021-09-03
CN113345032B true CN113345032B (en) 2023-09-15

Family

ID=77482901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110767999.2A Active CN113345032B (en) 2021-07-07 2021-07-07 Initialization map building method and system based on wide-angle camera large distortion map

Country Status (1)

Country Link
CN (1) CN113345032B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113899357B (en) * 2021-09-29 2023-10-31 北京易航远智科技有限公司 Incremental mapping method and device for visual SLAM, robot and readable storage medium
CN113592865B (en) * 2021-09-29 2022-01-25 湖北亿咖通科技有限公司 Quality inspection method and equipment for three-dimensional map and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN105654484A (en) * 2015-12-30 2016-06-08 西北工业大学 Light field camera external parameter calibration device and method
CN109345471A (en) * 2018-09-07 2019-02-15 贵州宽凳智云科技有限公司北京分公司 High-precision map datum method is drawn based on the measurement of high-precision track data
CN110070615A (en) * 2019-04-12 2019-07-30 北京理工大学 A kind of panoramic vision SLAM method based on polyphaser collaboration
CN110108258A (en) * 2019-04-09 2019-08-09 南京航空航天大学 A kind of monocular vision odometer localization method

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10572982B2 (en) * 2017-10-04 2020-02-25 Intel Corporation Method and system of image distortion correction for images captured by using a wide-angle lens
KR102367361B1 (en) * 2018-06-07 2022-02-23 우이시 테크놀로지스 (베이징) 리미티드. Location measurement and simultaneous mapping method and device

Non-Patent Citations (1)

Title
Map recovery and fusion in monocular simultaneous localization and mapping; 张剑华, 王燕燕, 王曾媛, 陈胜勇, 管秋; Journal of Image and Graphics (中国图象图形学报), No. 03; full text *

Also Published As

Publication number Publication date
CN113345032A (en) 2021-09-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant