CN112862768B - Adaptive monocular VIO (visual-inertial odometry) initialization method based on point-line features - Google Patents


Publication number
CN112862768B
Authority
CN
China
Prior art keywords: point, line, IMU, constraint, initialization
Prior art date
Legal status
Active
Application number
CN202110119124.1A
Other languages
Chinese (zh)
Other versions
CN112862768A
Inventor
范馨月
宋子苑
陶交
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202110119124.1A
Publication of CN112862768A
Application granted
Publication of CN112862768B

Classifications

    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G01C 25/00: Manufacturing, calibrating, cleaning, or repairing instruments or devices referred to in the other groups of this subclass
    • G01C 25/005: Initial alignment, calibration or starting-up of inertial devices
    • G06T 2207/10016: Video; image sequence
    • G06T 2207/20164: Salient point detection; corner detection


Abstract

The invention relates to a point-line-feature-based adaptive monocular VIO (visual-inertial odometry) initialization method, which belongs to the technical field of robot visual positioning and navigation and comprises the following steps: S1: input image frames, detect point features and line features respectively, input the data acquired by the IMU, and perform pre-integration between image frames; S2: estimate the initial pose of the camera; S3: construct a maximum a posteriori estimation problem and optimize the inertial parameters to obtain the scale factor, velocity information, gravity direction, and the gyroscope and accelerometer biases of the IMU; S4: perform visual-inertial alignment and scale recovery, converting the initial camera pose into the world coordinate system; S5: the initial values converge. The invention achieves more stable and accurate initialization in different complex environments and from different initial states, addresses sensor uncertainty and the inconsistency of the inertial parameters in the VIO initialization process, and offers higher performance.

Description

Adaptive monocular VIO (visual-inertial odometry) initialization method based on point-line features
Technical Field
The invention belongs to the technical field of robot visual positioning and navigation, and relates to an adaptive monocular VIO (visual-inertial odometry) initialization method based on point-line features.
Background
With the development of computer technology, research on mobile robots has advanced rapidly. For a robot to move autonomously in an unknown environment, two problems must be solved: estimating the pose of the robot in real time, and building a map from that pose to support subsequent tasks such as autonomous positioning, path planning and obstacle avoidance. In practical applications a robot usually carries sensors with different functions; a SLAM system equipped with a camera and an IMU is called visual-inertial SLAM, and the corresponding odometry is called visual-inertial odometry (VIO). Such systems have the advantages of small volume, low cost and strong scene recognition capability, and have therefore attracted extensive attention in the field.
For VIO, the initialization module is particularly important. The determination of initial parameters such as the gravity direction, velocity and IMU biases determines the accuracy of the system. In particular, scale cannot be observed directly by a monocular camera, which makes the fusion of vision and inertia, and hence VIO initialization, difficult. For initialization of the IMU, since the accelerometer measurements are affected by gravity, estimating the gravity direction is also a decisive factor in pose estimation. If the initialization is erroneous, the accuracy of the whole system degrades, and optimization-based methods may become trapped in a local optimum. Current initialization methods are mainly divided into tightly coupled and loosely coupled approaches, and different solutions to the above problems have been proposed. The document "Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision, 2014" provides a closed-form scheme for jointly recovering parameters such as scale, gravity, bias and initial velocity, from which the camera pose can be roughly estimated using IMU data. The documents "Mur-Artal et al., Visual-inertial monocular SLAM with map reuse, IEEE Robotics and Automation Letters, 2017" and "T. Qin et al., VINS-Mono: A robust and versatile monocular visual-inertial state estimator, IEEE Transactions on Robotics, vol. 34, 2018" are based on the assumption that a monocular camera can accurately estimate an up-to-scale camera trajectory; they estimate the inertial parameters from the camera trajectory and refine them by bundle adjustment (BA), solving the inertial parameters by least squares in linear equations provided by the visual information. However, both initialization schemes ignore the uncertainty of the sensors, and because the inertial parameters are solved in separate steps, their correlation is ignored.
In summary, the problems in the VIO field are: 1) excessive reliance on scene features. Existing VIO initialization algorithms generally use point features for the purely visual estimation, but in weak-texture environments, such as corridors and walls, it is difficult to extract a sufficient number of feature points, so initialization fails and the positioning accuracy of the system is poor. 2) The correlation between sensor uncertainty and the inertial parameters is not considered, and the accelerometer bias of the IMU is typically ignored, resulting in less accurate estimates. 3) The requirements on the initial state are high: the camera must provide sufficient rotation and translation during the initialization stage, so these methods are applicable only in specific situations.
Disclosure of Invention
In view of this, the present invention aims to solve problems such as the difficulty of estimating the initial camera pose when point features are insufficient in weak-texture environments, the resulting poor positioning accuracy of the system, the neglect of sensor uncertainty and parameter correlation during inertial parameter estimation, and the limited applicability of existing initialization schemes. It introduces line features as an optional input to the purely visual SFM, providing robustness when the scene texture is insufficient for a reliable estimate; at the same time, it constructs a maximum a posteriori estimation problem to solve for the inertial parameters, guaranteeing their consistency and making the method suitable for any initialization situation. On this basis, an adaptive monocular VIO initialization method based on point-line features is provided.
In order to achieve the purpose, the invention provides the following technical scheme:
a self-adaptive monocular VIO initialization method based on point-line features comprises the following steps:
s1: inputting an image frame, and respectively detecting a point feature and a line feature; inputting data acquired by the IMU, and performing IMU pre-integration calculation between each image frame;
s2: estimating an initial pose of the camera: firstly, judging whether the point features meet the parallax and quantity requirements; if so, solving the essential matrix by the eight-point method and estimating the initial pose of the camera; otherwise, introducing line features, calculating weak-constraint scores for the matches, screening out the line features usable for initialization, and estimating the initial pose of the camera from the point-line distance constraint;
s3: constructing a maximum posterior estimation problem, optimizing inertial parameters, and obtaining a scale factor, speed information, a gravity direction, and gyroscope bias and accelerometer bias of the IMU;
s4: visual-inertial alignment and scale recovery, simultaneously converting the initial pose of the camera into the world coordinate system;
s5: the initial value converges and the initialization is completed.
Further, step S1 specifically includes: detecting point features with the Shi-Tomasi corner algorithm, which detects corners based on gradient change and is an improvement of Harris corner detection; detecting line features with the LSD (Line Segment Detector) algorithm, whose core idea is to merge pixels with similar gradient directions and thereby quickly detect straight line segments in an image. IMU pre-integration means integrating all IMU measurements between frame k and frame k+1 of the image to obtain the relative position, velocity and rotation (PVQ) between the two frames, which provides initial values for vision and serves as a constraint term in back-end optimization.
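The pre-integration in step S1 can be sketched as follows. This is a minimal NumPy illustration under simplifying assumptions (function names are ours; bias and noise terms are omitted), using the same first-order small-rotation update that appears later in the closed-form line solution:

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix so that skew(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def preintegrate(accels, gyros, dt):
    """Integrate raw IMU samples taken between two image frames into
    relative position, velocity and rotation (the 'PVQ' terms).
    Biases and noise are deliberately omitted for clarity."""
    dP = np.zeros(3)
    dV = np.zeros(3)
    dR = np.eye(3)
    for a, w in zip(accels, gyros):
        a_rot = dR @ a                       # rotate body accel into the frame-k frame
        dP = dP + dV * dt + 0.5 * a_rot * dt**2
        dV = dV + a_rot * dt
        dR = dR @ (np.eye(3) + skew(w * dt))  # first-order rotation update
    return dP, dV, dR
```

Accumulated over all samples between two image frames, (dP, dV, dR) are the PVQ terms that provide initial values for vision and act as constraint terms in back-end optimization.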
Further, in step S2, the initial camera pose is estimated in two cases according to whether the point features satisfy the initialization conditions:
case 1: the point features meet the parallax and quantity requirements, and the relation between corresponding points obtained from epipolar geometry is:
$x_2^T\, t^{\wedge} R\, x_1 = 0$
wherein $x_1 = (u_1, v_1, 1)^T$ and $x_2 = (u_2, v_2, 1)^T$ are the coordinates of corresponding pixel points on the normalized plane, and $R$ and $t$ are the camera motion between the two frames, representing rotation and translation respectively; the middle part is denoted the essential matrix $E$, expressed as $E = t^{\wedge} R$, a $3 \times 3$ matrix with 5 degrees of freedom;
the essential matrix is solved by the eight-point method; from epipolar geometry each correspondence gives:
$(u_2 u_1,\; u_2 v_1,\; u_2,\; v_2 u_1,\; v_2 v_1,\; v_2,\; u_1,\; v_1,\; 1) \cdot \mathbf{e} = 0$, where $\mathbf{e}$ is the vector of the nine entries of $E$ in row-major order;
to solve for $E$, eight pairs of matching points are needed to form eight such equations; the essential matrix $E$ is then solved by singular value decomposition (SVD), and the solution yielding positive depth is taken as the final estimate;
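As an illustration of the eight-point solve (a NumPy sketch, not the patent's own code): each pair of matching points contributes one linear equation in the nine entries of E, the null vector of the stacked system gives E, and a second SVD projects it onto the essential-matrix manifold:

```python
import numpy as np

def essential_eight_point(x1, x2):
    """Estimate the essential matrix from >= 8 normalized-plane
    correspondences x1[i] <-> x2[i], each of the form (u, v, 1)."""
    # one row per correspondence: coefficients of the 9 entries of E
    A = np.array([[u2 * u1, u2 * v1, u2, v2 * u1, v2 * v1, v2, u1, v1, 1.0]
                  for (u1, v1, _), (u2, v2, _) in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)                 # null vector of A, row-major
    U, S, Vt = np.linalg.svd(E)              # enforce essential-matrix
    E = U @ np.diag([1.0, 1.0, 0.0]) @ Vt    # singular values (1, 1, 0)
    return E
```

With noise-free correspondences the recovered E satisfies the epipolar relation up to scale; the decomposition with positive depth would then be selected, as the text describes.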
case 2: line features are introduced when the point features do not meet the initialization requirements; matched pairs are screened by calculating weak-constraint scores, and the initial camera pose is solved from the point-line distance constraint. The weak constraint comprises a descriptor constraint and an epipolar constraint, for which the scores $s_d$ and $s_e$ are calculated respectively, as follows:
the LSD line segments use the LBD descriptor, which accumulates pixel gradient statistics and takes their mean vector and standard deviation as the descriptor. The descriptor constraint mainly serves to eliminate mismatches with large appearance differences: the Hamming distance between the reference-frame descriptor $desc_1$ and the current-frame descriptor $desc_2$ is computed, and the descriptor score $s_d$ is 1 if the distance is below the threshold $\tau_{desc}$ and 0 otherwise:
$s_d = \begin{cases} 1, & d_H(desc_1, desc_2) < \tau_{desc} \\ 0, & \text{otherwise} \end{cases}$
As for the epipolar constraint, since line features obey no strict epipolar constraint, it is used as a weak constraint term to enhance reliability. First the epipolar lines of the two endpoints of the reference-frame line feature are calculated; the straight line containing the corresponding current-frame line feature AB intersects these epipolar lines at points C and D, and the constraint score is defined as:
Figure RE-RE-GDA0003011474060000033
wherein $d_{min}$ denotes the minimum Euclidean distance among the four collinear points and $d_{max}$ the maximum Euclidean distance among the four collinear points;
finally, for each pair of matched lines the score $s = s_d \cdot s_e$ is calculated; if $s$ is larger than a certain threshold, the matched pair is considered usable for initialization, and the closed-form solution is carried out;
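The screening of matched line pairs can be sketched as below. The descriptor score follows the Hamming-distance rule above; since the exact expression for $s_e$ appears only as an equation image in the source, the geometric score here is our own overlap-based stand-in over the four collinear points, and the threshold value is illustrative:

```python
import numpy as np

TAU_DESC = 30  # Hamming threshold on binary LBD-style descriptors (illustrative)

def descriptor_score(desc1, desc2):
    """s_d in {0, 1}: accept only if the reference- and current-frame
    descriptors are within TAU_DESC in Hamming distance."""
    return 1 if np.count_nonzero(desc1 != desc2) < TAU_DESC else 0

def epipolar_score(a, b, c, d):
    """s_e in [0, 1] from the four collinear points: A, B (current-frame
    segment endpoints) and C, D (epipolar-line intersections), given as
    1-D coordinates along their common line.  Scores how well segment CD
    overlaps segment AB; this is an assumed reading, not the source's
    exact formula."""
    lo = max(min(a, b), min(c, d))
    hi = min(max(a, b), max(c, d))
    span = max(a, b, c, d) - min(a, b, c, d)
    return max(0.0, hi - lo) / span if span > 0 else 1.0

def match_score(desc1, desc2, a, b, c, d):
    """Combined weak-constraint score s = s_d * s_e used to gate line
    matches before the closed-form pose solution."""
    return descriptor_score(desc1, desc2) * epipolar_score(a, b, c, d)
```

A match passes to the closed-form step only when this combined score exceeds the chosen threshold.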
the closed-form solution process is as follows: the projections of the endpoints of a 3D line feature should theoretically fall on the line observed by the camera; the coefficients of the normalized line feature are obtained as:
$\mathbf{l} = (l_1, l_2, l_3)^T = x_s \times x_e$
where $x_s$ and $x_e$ are the two observed endpoints on the normalized plane. Denote the inverse depths of the endpoints of the line feature by $\rho_{ks}$ and $\rho_{ke}$; the reprojection of a 3D line endpoint onto the normalized plane is then:
$\bar{x}_{ks} = \pi\!\left(R_i\, \tfrac{1}{\rho_{ks}} x_{ks} + t\right)$
where $\pi(\cdot)$ is the reprojection function, $\pi\big((x, y, z)^T\big) = (x/z,\; y/z,\; 1)^T$, and $R_i$ is the rotation matrix under the small-rotation assumption, i.e. assuming the rotation between successive image frames is small. Let the camera rotation vector and translation vector be $r = (r_1, r_2, r_3)^T$ and $t = (t_1, t_2, t_3)^T$ respectively; the rotation matrix is approximated by a first-order Taylor expansion:
$R_i \approx I + r^{\wedge} = \begin{pmatrix} 1 & -r_3 & r_2 \\ r_3 & 1 & -r_1 \\ -r_2 & r_1 & 1 \end{pmatrix}$
The distance between the projected point and the observed line is zero; taking the starting point as an example, the constraint is expressed as:
$\mathbf{l}^T\!\left(R_i\, \tfrac{1}{\rho_{ks}} x_{ks} + t\right) = 0$
namely, after multiplying through by $\rho_{ks}$ and substituting the first-order expansion:
$\mathbf{l}^T (I + r^{\wedge})\, x_{ks} + \rho_{ks}\, \mathbf{l}^T t = 0$
Under the small-rotation assumption the term $\rho_{ks}\, \mathbf{l}^T t$ is negligible, so the above equation simplifies to:
$A r_1 + B r_2 + C r_3 + D = 0$
wherein, writing $x_{ks} = (u_s, v_s, 1)^T$:
$A = v_s l_3 - l_2$
$B = l_1 - u_s l_3$
$C = u_s l_2 - v_s l_1$
$D = l_1 u_s + l_2 v_s + l_3$
In addition, the other endpoint $x_{ke}$ satisfies the same constraint, so a pair of matched lines yields two equations. If several pairs of matched lines exist, the closed-form solution is obtained from the following linear system, whose unique solution is found by SVD:
$\begin{pmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ \vdots & \vdots & \vdots \end{pmatrix}\! \begin{pmatrix} r_1 \\ r_2 \\ r_3 \end{pmatrix} = -\begin{pmatrix} D_1 \\ D_2 \\ \vdots \end{pmatrix}$
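A numerical sketch of this closed-form step (illustrative NumPy code with our own helper names): under the small-rotation constraint $\mathbf{l}^T(I + r^{\wedge})x \approx 0$, each matched endpoint contributes one row (A, B, C, D), with the coefficients taken here as $(A, B, C) = x \times \mathbf{l}$ and $D = \mathbf{l} \cdot x$ (our reconstruction, since the source formulas are images), and the stacked system is solved for the rotation vector r in the least-squares sense:

```python
import numpy as np

def constraint_row(x, l):
    """Row (A, B, C, D) of the linear constraint A*r1 + B*r2 + C*r3 + D = 0
    for one endpoint x = (u, v, 1) and one observed line l = (l1, l2, l3):
    l . ((I + skew(r)) x) = l . x + r . (x cross l)."""
    A, B, C = np.cross(x, l)
    return [A, B, C, float(l @ x)]

def rotation_from_line_constraints(rows):
    """Solve the stacked system [A_j B_j C_j] r = -D_j for the small
    rotation vector r = (r1, r2, r3) in the least-squares sense."""
    M = np.asarray(rows, dtype=float)
    r, *_ = np.linalg.lstsq(M[:, :3], -M[:, 3], rcond=None)
    return r
```

With at least two matched lines (four rows) the system determines the three rotation components, matching the "two equations per matched pair" count in the text.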
further, in step S3, constructing a maximum a posteriori estimation problem, optimizing IMU-related parameters, and obtaining a scale factor, velocity information, a gravity direction, and a gyroscope bias and an accelerometer bias of the IMU;
first, the estimated inertial parameters are:
Figure RE-RE-GDA0003011474060000051
where s is a scale factor, R wg For gravity direction, the b vector includes IMU accelerometer bias b a And gyroscope bias b g
Figure RE-RE-GDA0003011474060000052
Is the speed of the 0 th frame to the k th frame of no scale; establishing a MAP problem containing prior by an IMU pre-integration theory;
Figure RE-RE-GDA0003011474060000053
wherein is
Figure RE-RE-GDA0003011474060000054
The likelihood values are such that,
Figure RE-RE-GDA0003011474060000055
is a value that is a priori known to the user,
Figure RE-RE-GDA0003011474060000056
representing a set of IMU pre-integrals between successive keyframes within an initialization window; assuming that the IMU measurements are independent each time, the MAP problem is described as:
Figure RE-RE-GDA0003011474060000057
and (3) assuming that errors of IMU pre-integration and prior distribution are Gaussian errors, and obtaining a final optimization problem:
Figure RE-RE-GDA0003011474060000058
wherein r is p In order to be a priori the error,
Figure RE-RE-GDA0003011474060000059
pre-integrating the error for the IMU; and in the optimization process, the updating formula of the gravity direction and the scale factor is as follows:
Figure RE-RE-GDA00030114740600000510
s new =s old exp(δ s )
the method considers the uncertainty of IMU, establishes the estimation of the inertial parameters as the optimal estimation problem, does not need to assume to ignore the bias of an accelerometer, and adds the known information as prior information into the MAP problem; all inertial parameters are estimated at one time, and the problem of data inconsistency is avoided.
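The multiplicative update $s^{new} = s^{old} \exp(\delta_s)$ keeps the scale positive throughout the optimization. The following toy one-dimensional example (entirely illustrative, not the patent's solver) minimizes a MAP-style cost, a Gaussian prior plus Gaussian measurement residuals, using that update rule:

```python
import math

def map_scale(s0, sigma_p, meas, sigma_m, iters=200):
    """Toy 1-D analogue of the inertial-only MAP step: estimate a scale s
    from a Gaussian prior (s0, sigma_p) and direct measurements of s,
    applying the multiplicative update s_new = s_old * exp(delta_s) so
    that s stays positive.  Plain gradient descent; illustrative only."""
    s = s0
    for _ in range(iters):
        # gradient of 0.5 * [ (s - s0)^2/sigma_p^2 + sum (s - z)^2/sigma_m^2 ]
        g = (s - s0) / sigma_p**2 + sum((s - z) / sigma_m**2 for z in meas)
        delta = -0.01 * g * s      # chain rule: d s / d delta_s = s
        s = s * math.exp(delta)
    return s
```

For this quadratic cost the minimizer is the precision-weighted mean of the prior and the measurements, which the iteration approaches.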
Further, in step S4, after the inertial parameter optimization is completed, the scale estimate required by monocular vision is obtained; the camera poses, velocities and 3D map points are scaled accordingly, aligned with the gravity direction and converted into the world coordinate system, and the IMU pre-integration is recalculated and updated. At this point the visual and inertial parameters have each been estimated, and a final BA optimization yields the optimal solution.
The invention has the following beneficial effects: 1) it improves on the low accuracy, poor robustness and limited applicability of traditional methods, and achieves stable and accurate initialization in different complex environments and from different initial states; 2) the adaptive purely visual SFM estimation with point-line features adapts well to weak-texture environments, providing structural information and improving reliability; 3) the inertial optimization based on maximum a posteriori estimation resolves sensor uncertainty and the inconsistency of the inertial parameters in the VIO initialization process. Simulation results show that the method outperforms existing VIO algorithms.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
fig. 1 is a flowchart of the adaptive monocular VIO initialization algorithm based on point-line features according to an embodiment of the present invention;
FIG. 2 is a flow chart of line feature processing provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the line-feature epipolar constraint provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of feature extraction in a weak-texture environment according to an embodiment of the present invention;
FIG. 5 is a schematic comparison of the trajectories obtained by the method of the present invention and by a conventional VIO method against the ground-truth trajectory;
FIG. 6 is a comparison of the root mean square error (RMSE) obtained by the VIO method of the present invention and by a conventional VIO method.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for the purpose of illustrating the invention only and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and they do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
Please refer to fig. 1 to 6, which illustrate an adaptive monocular VIO initialization method based on point-line features.
Fig. 1 is a flowchart of an adaptive monocular VIO initialization algorithm based on point-line features according to an embodiment of the present invention; as shown in the drawing, the algorithm includes the following.
the method comprises the steps of firstly detecting point characteristics through a Shi-Tomasi corner algorithm, detecting based on gradient change by the algorithm, and belonging to an improved algorithm of Harris corner detection. The line characteristics adopt an LSD (least squares-invariant feature) line detection algorithm, and the core idea is to combine pixels with similar gradient directions and quickly detect straight line segments in an image. The IMU pre-integration means that all IMUs between the kth frame and the (k + 1) th frame of an image are integrated, so that PVQ values between the (k + 1) th frame, namely position, speed and rotation values can be obtained, initial values are provided for vision, and the initial values are used as constraint terms of back-end optimization.
The initial pose of the camera is then estimated in two cases according to whether the point features satisfy the initialization conditions:
case 1: the point features meet the parallax and quantity requirements, and the relation between corresponding points can be obtained from epipolar geometry:
$x_2^T\, t^{\wedge} R\, x_1 = 0$
wherein $x_1 = (u_1, v_1, 1)^T$ and $x_2 = (u_2, v_2, 1)^T$ are the coordinates of corresponding pixel points on the normalized plane, and $R$ and $t$ are the camera motion between the two frames, representing rotation and translation respectively. The middle part is denoted the essential matrix $E$, expressed as $E = t^{\wedge} R$, a $3 \times 3$ matrix with 5 degrees of freedom.
Solving the essential matrix by the eight-point method, each correspondence gives, from epipolar geometry:
$(u_2 u_1,\; u_2 v_1,\; u_2,\; v_2 u_1,\; v_2 v_1,\; v_2,\; u_1,\; v_1,\; 1) \cdot \mathbf{e} = 0$, where $\mathbf{e}$ is the vector of the nine entries of $E$ in row-major order.
To solve for $E$, eight pairs of matching points are needed to form eight such equations; the essential matrix $E$ is solved by singular value decomposition (SVD), and the solution with positive depth is taken as the final estimate.
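The positive-depth selection mentioned above can be sketched as follows (an illustrative NumPy version with assumed helper names): the SVD of E yields four (R, t) candidates, and the one that places a triangulated correspondence in front of both cameras is kept:

```python
import numpy as np

def triangulate_depth(R, t, x1, x2):
    """Depth z of x1 such that R*(z*x1) + t is collinear with x2
    (least squares on the cross-product equation); x1, x2 are (u, v, 1)."""
    a = np.cross(R @ x1, x2)   # z * a = b must hold for collinearity
    b = -np.cross(t, x2)
    return float(a @ b) / float(a @ a)

def decompose_essential(E, x1, x2):
    """Recover (R, t) from an essential matrix by SVD and keep, among the
    four candidates, the one whose triangulated point has positive depth
    in both views (the cheirality / positive-depth check)."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    for R in (U @ W @ Vt, U @ W.T @ Vt):
        for t in (U[:, 2], -U[:, 2]):
            z1 = triangulate_depth(R, t, x1, x2)
            if z1 > 0 and (R @ (z1 * x1) + t)[2] > 0:
                return R, t
    raise ValueError("no candidate passed the positive-depth check")
```

Note that t is recovered only up to scale (as a unit vector); this is the scale ambiguity that the inertial MAP optimization later resolves.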
Case 2: line features are introduced when the point features do not meet the initialization conditions; matched pairs are screened by calculating weak-constraint scores, and the initial camera pose is solved from the point-line distance constraint. The weak constraint comprises a descriptor constraint and an epipolar constraint, for which the scores $s_d$ and $s_e$ are calculated respectively; the flow chart is shown in FIG. 2. The process is as follows:
the LSD line segments use the LBD descriptor, which accumulates pixel gradient statistics and takes their mean vector and standard deviation as the descriptor. The descriptor constraint mainly serves to eliminate mismatches with large appearance differences: the Hamming distance between the reference-frame descriptor $desc_1$ and the current-frame descriptor $desc_2$ is computed, and the descriptor score $s_d$ is 1 if the distance is below the threshold $\tau_{desc}$ and 0 otherwise:
$s_d = \begin{cases} 1, & d_H(desc_1, desc_2) < \tau_{desc} \\ 0, & \text{otherwise} \end{cases}$
As for the epipolar constraint, since line features obey no strict epipolar constraint, it is used as a weak constraint term to enhance reliability. As shown in FIG. 3, which depicts the epipolar constraint at the two endpoints of a line segment, the epipolar lines $l_1$ and $l_2$ of the two endpoints of the reference-frame line feature are first calculated; the straight line containing the corresponding current-frame line feature AB intersects the epipolar lines at points C and D. The constraint score is defined as:
Figure RE-RE-GDA0003011474060000081
wherein $d_{min}$ denotes the minimum Euclidean distance among the four collinear points and $d_{max}$ the maximum Euclidean distance among the four collinear points.
Finally, for each pair of matched lines the score $s = s_d \cdot s_e$ is calculated; if $s$ is larger than a certain threshold, the matched pair is considered usable for initialization, and the closed-form solution is carried out. The closed-form solution process is as follows: the projections of the endpoints of a 3D line feature should theoretically fall on the line observed by the camera, so the coefficients of the normalized line feature can be obtained as:
$\mathbf{l} = (l_1, l_2, l_3)^T = x_s \times x_e$
where $x_s$ and $x_e$ are the two observed endpoints on the normalized plane. Denoting the inverse depths of the endpoints of the line feature by $\rho_{ks}$ and $\rho_{ke}$, the reprojection of a 3D line endpoint onto the normalized plane can be expressed as:
$\bar{x}_{ks} = \pi\!\left(R_i\, \tfrac{1}{\rho_{ks}} x_{ks} + t\right)$
where $\pi(\cdot)$ is the reprojection function, which can be expressed as $\pi\big((x, y, z)^T\big) = (x/z,\; y/z,\; 1)^T$, and $R_i$ is the rotation matrix under the small-rotation assumption, i.e. assuming the rotation between successive image frames is small. Let the camera rotation vector and translation vector be $r = (r_1, r_2, r_3)^T$ and $t = (t_1, t_2, t_3)^T$ respectively; the rotation matrix can be approximated by a first-order Taylor expansion:
$R_i \approx I + r^{\wedge} = \begin{pmatrix} 1 & -r_3 & r_2 \\ r_3 & 1 & -r_1 \\ -r_2 & r_1 & 1 \end{pmatrix}$
Since the projected point lies on the observation line, the distance between the two is zero. Taking the starting point as an example, the constraint can be expressed as:
$\mathbf{l}^T\!\left(R_i\, \tfrac{1}{\rho_{ks}} x_{ks} + t\right) = 0$
namely, after multiplying through by $\rho_{ks}$ and substituting the first-order expansion:
$\mathbf{l}^T (I + r^{\wedge})\, x_{ks} + \rho_{ks}\, \mathbf{l}^T t = 0$
Under the small-rotation assumption the term $\rho_{ks}\, \mathbf{l}^T t$ is negligible, so the above equation simplifies to:
$A r_1 + B r_2 + C r_3 + D = 0$
wherein, writing $x_{ks} = (u_s, v_s, 1)^T$:
$A = v_s l_3 - l_2$
$B = l_1 - u_s l_3$
$C = u_s l_2 - v_s l_1$
$D = l_1 u_s + l_2 v_s + l_3$
In addition, the other endpoint $x_{ke}$ satisfies the same constraint, so a pair of matched lines can yield two equations. If there are multiple pairs of matched lines, the closed-form solution is obtained from the following linear system, whose unique solution is found by SVD:
$\begin{pmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ \vdots & \vdots & \vdots \end{pmatrix}\! \begin{pmatrix} r_1 \\ r_2 \\ r_3 \end{pmatrix} = -\begin{pmatrix} D_1 \\ D_2 \\ \vdots \end{pmatrix}$
For inertia estimation, a maximum a posteriori (MAP) estimation problem is constructed and the relevant parameters of the IMU are optimized to obtain the scale factor, the velocity information, the gravity direction, and the gyroscope and accelerometer biases of the IMU. First, the inertial parameters to be estimated are:

χ_k = { s, R_wg, b, v̄_{0:k} }

where s is the scale factor, R_wg is the gravity direction, the bias vector b includes the IMU accelerometer bias b_a and gyroscope bias b_g, and v̄_{0:k} are the up-to-scale velocities from frame 0 to frame k. A MAP problem containing a prior can be established from IMU pre-integration theory:

p(χ_k | I_{0:k}) ∝ p(I_{0:k} | χ_k) p(χ_k)

where p(I_{0:k} | χ_k) is the likelihood, p(χ_k) is the prior, and I_{0:k} = {I_{0,1}, …, I_{k−1,k}} denotes the set of IMU pre-integrations between successive keyframes within the initialization window. Assuming the IMU measurements are mutually independent, the MAP problem can be described as:

p(χ_k | I_{0:k}) ∝ p(χ_k) ∏_{i=1}^{k} p(I_{i−1,i} | s, R_wg, b, v̄_{i−1}, v̄_i)

Assuming the errors of the IMU pre-integration and of the prior distribution are Gaussian, the final optimization problem is obtained:

χ_k* = argmin_{χ_k} ( ‖r_p‖²_{Σ_p} + Σ_{i=1}^{k} ‖r_{I_{i−1,i}}‖²_{Σ_{I_{i−1,i}}} )

where r_p is the prior error and r_{I_{i−1,i}} is the IMU pre-integration error. During optimization, the update formulas of the gravity direction and the scale factor are:

R_wg^new = R_wg^old · Exp(δα_g)
s_new = s_old · exp(δ_s)
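The two update rules keep the gravity direction on the rotation manifold and the scale strictly positive. A minimal sketch, using Rodrigues' formula for the exponential map Exp (function and variable names are ours):

```python
import numpy as np

def exp_so3(w):
    """Rodrigues' formula: map an axis-angle vector w to a rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def update_gravity_and_scale(R_wg, s, delta_alpha, delta_s):
    """Multiplicative updates used during the inertial optimization:
    the gravity direction is perturbed on the rotation manifold and the
    scale in log space, so both stay in their valid domains."""
    return R_wg @ exp_so3(np.asarray(delta_alpha, float)), s * np.exp(delta_s)
```

Updating in log space means even a large negative step δ_s can never drive the scale to zero or negative.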
The method takes the uncertainty of the IMU into account and formulates the inertial-parameter estimation as an optimal estimation problem; it does not require the accelerometer bias to be neglected, and known information is added to the MAP problem as a prior. All inertial parameters can be estimated in one pass, avoiding data inconsistency.
After the inertial parameters are optimized, the scale estimate required by monocular vision is obtained. Scaling by this factor yields the camera poses, velocities and 3D map points; these are aligned with the gravity direction, the poses are converted into the world coordinate system, and the IMU pre-integration is recalculated and updated. At this point the visual and inertial parameters have been estimated separately; finally BA optimization is performed to obtain the optimal solution.
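The alignment step can be sketched as follows (an illustrative convention, not the patent's notation: we assume R_wg maps the gravity-aligned world frame to the visual reference frame, so its transpose rotates the scaled quantities into the world frame):

```python
import numpy as np

def align_to_world(R_wg, s, positions, velocities, points):
    """Scale the up-to-scale visual quantities by s and rotate them into
    the gravity-aligned world frame.

    positions, velocities, points: (N, 3) arrays in the reference frame."""
    R_gw = R_wg.T  # inverse rotation: reference frame -> world frame
    def scale_and_rotate(X):
        return (R_gw @ (s * np.asarray(X, dtype=float)).T).T
    return (scale_and_rotate(positions),
            scale_and_rotate(velocities),
            scale_and_rotate(points))
```

After this transformation the IMU pre-integration terms are recomputed with the updated biases, as the text describes.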
Fig. 4 is a schematic diagram of feature extraction in a weak-texture environment according to an embodiment of the present invention. As can be seen from the figure, when the environment texture is not distinct it is difficult to extract point features; in this case line features supply structural information to solve the problem, enhancing the robustness of the VIO.
The experiments were performed on the mainstream EuRoC data set. The data set was collected by a micro aerial vehicle (MAV) acquiring image and IMU information in an industrial environment; it comprises 11 sequences in total, classified as easy, medium and difficult according to illumination conditions, texture and motion speed, and is suitable for testing the performance of the invention.
Fig. 5 compares the trajectories obtained by the VIO method of the present invention and by the conventional VIO method against the ground-truth trajectory, wherein (a) is a trajectory diagram for the V2_01_easy sequence, which has insufficient parallax and small translation in the initial stage; (b) is a trajectory diagram for the MH_05_difficult sequence, which remains almost stationary in the initial phase and stays for a long period in a poorly lit, weakly textured environment. It can be seen that the trajectory obtained by the VIO method of the invention is closer to the ground truth, verifying that the method has better accuracy.
FIG. 6 compares the root mean square error (RMSE) obtained by the VIO method of the present invention with that of the conventional VIO method, where (a) shows the RMSE variation for the V2_01_easy sequence and (b) for the MH_05_difficult sequence. The RMSE of the VIO method of the invention is lower overall than that of the conventional method and varies less, verifying that the method has better stability.
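For reference, the RMSE reported here is the usual root mean square of the per-frame translation errors; a minimal sketch (names are ours):

```python
import numpy as np

def rmse(estimated, ground_truth):
    """Root mean square error between per-frame translation estimates and
    ground truth, as used for absolute trajectory error.

    estimated, ground_truth: (N, 3) arrays of positions."""
    err = np.asarray(estimated, float) - np.asarray(ground_truth, float)
    # Per-frame Euclidean error, squared, averaged, then square-rooted.
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))
```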
Table 1 gives statistics of the translation error (Translation) and rotation error (Rotation) of the VIO method of the present invention and the conventional VIO algorithm on the EuRoC data set, both measured by RMSE. As can be seen from the data in Table 1, the VIO method of the present invention gives better results.
[Table 1 is reproduced as an image in the original: translation and rotation RMSE of the two methods on the EuRoC sequences.]
The monocular VIO initialization algorithm based on point-line features effectively addresses the low accuracy, poor robustness and limited applicability of the traditional method, and completes relatively stable and accurate initialization under different complex environments and different initial states. The introduction of line features lets the initialization adapt its pure visual estimation to environmental changes, enhancing reliability. The sensor uncertainty and the inertial-parameter inconsistency in the VIO initialization process are well handled; simulation results show that the method improves initialization real-time performance, accuracy and stability, and performs well. Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit it; although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made without departing from the spirit and scope of the technical solutions, all of which should be covered by the claims of the present invention.

Claims (4)

1. A self-adaptive monocular VIO initialization method based on point-line characteristics is characterized in that: the method comprises the following steps:
s1: inputting an image frame, and respectively detecting a point feature and a line feature; inputting data acquired by the IMU, and performing IMU pre-integration calculation between each image frame;
S2: estimating an initial camera pose; firstly, judging whether the point features meet the parallax and quantity requirements; if so, solving the essential matrix by the eight-point method and estimating the initial camera pose; otherwise, introducing line features, calculating weak-constraint scores for the matches, screening out line features for initialization, and estimating the initial camera pose from the point-line distance constraint; the initial camera pose is estimated according to two cases, depending on whether the point features meet the initialization conditions:
Case 1: the point features meet the parallax and quantity requirements; the relation between corresponding points obtained from epipolar geometry is:

x_2^T t^∧ R x_1 = 0

wherein x_1 = (u_1, v_1, 1)^T and x_2 = (u_2, v_2, 1)^T are the coordinates of the corresponding pixel points on the normalized plane, and R and t are the camera motion between the two frames, representing rotation and translation respectively; the middle part is denoted the essential matrix E, expressed as E = t^∧ R, a 3 × 3 matrix with 5 degrees of freedom;

the essential matrix is solved by the eight-point method; from the epipolar geometry:

(u_2 u_1, u_2 v_1, u_2, v_2 u_1, v_2 v_1, v_2, u_1, v_1, 1) · e = 0,  where e = (e_1, …, e_9)^T is E written as a vector;

to solve for E, eight pairs of matching points form eight such equations; the essential matrix E is solved by singular value decomposition (SVD), and the decomposition giving positive depths is taken as the final estimate;
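The eight-point solve in case 1 can be sketched as follows (an illustrative numpy version; the final projection onto the essential manifold and the names are ours):

```python
import numpy as np

def essential_eight_point(x1, x2):
    """Estimate E from N >= 8 correspondences satisfying x2^T E x1 = 0.

    x1, x2: (N, 3) homogeneous points on the normalized image plane."""
    # Each row is the Kronecker product x2_i (x) x1_i, which matches the
    # row-major flattening e = vec(E) of the epipolar constraint.
    A = np.stack([np.kron(p2, p1) for p1, p2 in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Enforce the essential-matrix structure: two equal singular values, one zero.
    U, S, Vt = np.linalg.svd(E)
    sigma = (S[0] + S[1]) / 2.0
    return U @ np.diag([sigma, sigma, 0.0]) @ Vt
```

In practice E is then decomposed into the four candidate (R, t) pairs and the one placing the triangulated points at positive depth is kept, as the claim states.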
Case 2: line features are introduced when the point features do not meet the initialization requirements; matched pairs are screened by calculating weak-constraint scores, and the initial camera pose is solved from the point-line distance constraint; the weak constraints comprise a descriptor constraint and an epipolar constraint, for which the scores s_d and s_e are calculated respectively; case 2 specifically includes the following steps:
S221: LSD line segments are described with LBD descriptors: pixel gradients are accumulated, and the mean vector and standard deviation of the statistics serve as the descriptor; the descriptor constraint mainly eliminates mismatches with large appearance differences: the Hamming distance between the reference-frame descriptor desc_1 and the current-frame descriptor desc_2 is computed; if it is below the threshold τ_desc the descriptor score s_d is recorded as 1, otherwise as 0, expressed as:

s_d = 1 if dist(desc_1, desc_2) < τ_desc, otherwise s_d = 0;
S222: the epipolar constraint serves as a weak-constraint term to enhance reliability; first the epipolar lines of the two end points of the reference-frame line feature are computed; the straight line containing the corresponding current-frame line feature AB intersects the epipolar lines at points C and D, and the epipolar constraint score is defined as:

s_e = d_min / d_max

wherein d_min represents the minimum Euclidean distance among the four collinear points and d_max represents the maximum Euclidean distance among the four collinear points;
S223: for each pair of matched lines, the score s = s_d · s_e is calculated; if s exceeds a preset threshold, the matched line pair is considered usable for initialization, and the closed-form solution is carried out;
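Steps S221 to S223 can be sketched as follows. The descriptor score is the thresholded Hamming distance described above; the exact epipolar-score formula is embedded in an image in the original, so the d_min/d_max ratio used below is an assumption, as are the names and the default threshold:

```python
import numpy as np

def descriptor_score(desc1, desc2, tau_desc=40):
    """s_d: 1 if the LBD Hamming distance is below tau_desc, else 0.

    desc1, desc2: equal-length byte sequences (uint8 values)."""
    d1 = np.unpackbits(np.asarray(desc1, dtype=np.uint8))
    d2 = np.unpackbits(np.asarray(desc2, dtype=np.uint8))
    hamming = int(np.count_nonzero(d1 != d2))
    return 1 if hamming < tau_desc else 0

def epipolar_score(pts):
    """s_e for the four collinear points (A, B, C, D); here assumed to be
    the ratio of the minimum to the maximum pairwise Euclidean distance."""
    pts = np.asarray(pts, dtype=float)
    d = [np.linalg.norm(pts[i] - pts[j])
         for i in range(4) for j in range(i + 1, 4)]
    return min(d) / max(d)

def match_score(desc1, desc2, pts):
    """Combined weak-constraint score s = s_d * s_e."""
    return descriptor_score(desc1, desc2) * epipolar_score(pts)
```

A pair is kept for the closed-form solve only when the combined score clears the preset threshold.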
S3: constructing a maximum a posteriori (MAP) estimation problem and optimizing the inertial parameters to obtain a scale factor, speed information, a gravity direction, and the gyroscope bias and accelerometer bias of the IMU; step S3 specifically includes: constructing the MAP estimation problem and optimizing the relevant parameters of the IMU to obtain the scale factor, the speed information, the gravity direction, and the gyroscope and accelerometer biases of the IMU;

first, the inertial parameters to be estimated are:

χ_k = { s, R_wg, b, v̄_{0:k} }

where s is the scale factor, R_wg is the gravity direction, the bias vector b includes the IMU accelerometer bias b_a and gyroscope bias b_g, and v̄_{0:k} are the up-to-scale velocities from frame 0 to frame k; the MAP problem containing a prior is established from IMU pre-integration theory:

p(χ_k | I_{0:k}) ∝ p(I_{0:k} | χ_k) p(χ_k)

where p(I_{0:k} | χ_k) is the likelihood, p(χ_k) is the prior, and I_{0:k} = {I_{0,1}, …, I_{k−1,k}} represents the set of IMU pre-integrations between successive keyframes within the initialization window; each IMU measurement being independent, the MAP problem is described as:

p(χ_k | I_{0:k}) ∝ p(χ_k) ∏_{i=1}^{k} p(I_{i−1,i} | s, R_wg, b, v̄_{i−1}, v̄_i)

the errors of the IMU pre-integration and the prior distribution being Gaussian, the final optimization problem is:

χ_k* = argmin_{χ_k} ( ‖r_p‖²_{Σ_p} + Σ_{i=1}^{k} ‖r_{I_{i−1,i}}‖²_{Σ_{I_{i−1,i}}} )

where r_p is the prior error and r_{I_{i−1,i}} is the IMU pre-integration error; during the optimization, the update formulas of the gravity direction and the scale factor are:

R_wg^new = R_wg^old · Exp(δα_g)
s_new = s_old · exp(δ_s)
S4: visual-inertial alignment: scaling is performed and the initial camera pose is simultaneously converted into the world coordinate system;
s5: the initial value converges and the initialization is completed.
2. The point-line feature-based adaptive monocular VIO initialization method of claim 1, wherein step S1 specifically includes: detecting point features with the Shi-Tomasi corner algorithm; detecting line features with the LSD (line segment detector) algorithm, which merges pixels with similar gradient directions to rapidly detect straight segments in the image; IMU pre-integration means integrating all IMU measurements between image frame k and frame k + 1 to obtain the PVQ values, namely position, velocity and rotation, at frame k + 1, providing initial values for vision and serving as constraint terms for the back-end optimization.
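The pre-integration of claim 2 can be sketched as a simplified Euler accumulation (biases and noise are ignored here, unlike a full pre-integration; names are ours):

```python
import numpy as np

def exp_so3(w):
    """Rodrigues' formula: map an axis-angle vector w to a rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def preintegrate(accels, gyros, dt):
    """Accumulate the relative rotation dR, velocity dV and position dP
    (the PVQ terms) from body-frame accelerations and angular rates
    sampled at a fixed period dt."""
    dR = np.eye(3)
    dV = np.zeros(3)
    dP = np.zeros(3)
    for a, w in zip(accels, gyros):
        a = np.asarray(a, dtype=float)
        dP = dP + dV * dt + 0.5 * (dR @ a) * dt ** 2   # position first (uses old dV)
        dV = dV + (dR @ a) * dt                        # then velocity
        dR = dR @ exp_so3(np.asarray(w, float) * dt)   # then rotation
    return dR, dV, dP
```

The accumulated (dR, dV, dP) provide the visual pipeline with initial values and serve as the inter-frame constraint terms in the back-end optimization.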
3. The point-line feature-based adaptive monocular VIO initialization method of claim 1, wherein the closed-form solving process in step S223 is as follows:
the end-point projection of the 3D line feature theoretically falls on the line observed by the camera; the coefficients of the normalized line feature are obtained as:

l = (l_1, l_2, l_3)^T / √(l_1² + l_2²)

the inverse depths of the line feature end points are denoted ρ_ks and ρ_ke respectively; the normalized reprojection of a 3D line end point is then:

π( (1/ρ_ks) R_i x_s + t )

where π(·) is the reprojection function, expressed as π((x, y, z)^T) = (x/z, y/z, 1)^T, and R_i is a rotation matrix under the small-rotation assumption, that is, assuming the rotation between successive image frames is small; letting the camera rotation vector and translation vector be r = (r_1, r_2, r_3)^T and t = (t_1, t_2, t_3)^T respectively, the rotation matrix is approximated by its first-order Taylor expansion:

R ≈ I + [r]_× = [ 1, −r_3, r_2 ; r_3, 1, −r_1 ; −r_2, r_1, 1 ]
the distance between the projection point and the observation line is zero; taking the starting point as an example, the constraint is expressed as

l^T · π( (1/ρ_ks) R x_s + t ) = 0

namely, multiplying through by the inverse depth ρ_ks:

l^T ( R x_s + ρ_ks t ) = 0

under the small-rotation assumption the term ρ_ks l^T t is neglected, so the above equation is simplified to:

A r_1 + B r_2 + C r_3 + D = 0

wherein, writing x_s = (u_s, v_s, 1)^T and l = (l_1, l_2, l_3)^T:

A = v_s l_3 − l_2
B = l_1 − u_s l_3
C = u_s l_2 − v_s l_1
D = l_1 u_s + l_2 v_s + l_3

in addition, the other end point x_e = (u_e, v_e, 1)^T satisfies the same constraint, so a pair of matched lines yields two equations; if there are multiple pairs of matched lines, the unique solution is obtained by SVD from the closed form of the stacked homogeneous linear system:

M [r_1, r_2, r_3, 1]^T = 0,  where each row of M collects the coefficients (A, B, C, D) of one endpoint constraint.
4. The point-line feature-based adaptive monocular VIO initialization method of claim 1, wherein step S4 specifically includes: after the inertial parameters are optimized, obtaining the scale estimate required by monocular vision; scaling accordingly to obtain the camera poses, velocities and 3D map points; aligning them with the gravity direction, converting the poses into the world coordinate system, and recalculating and updating the IMU pre-integration; and finally performing BA optimization to obtain the optimal solution.
CN202110119124.1A 2021-01-28 2021-01-28 Adaptive monocular VIO (visual image analysis) initialization method based on point-line characteristics Active CN112862768B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110119124.1A CN112862768B (en) 2021-01-28 2021-01-28 Adaptive monocular VIO (visual image analysis) initialization method based on point-line characteristics

Publications (2)

Publication Number Publication Date
CN112862768A CN112862768A (en) 2021-05-28
CN112862768B true CN112862768B (en) 2022-08-02

Family

ID=75987748

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110119124.1A Active CN112862768B (en) 2021-01-28 2021-01-28 Adaptive monocular VIO (visual image analysis) initialization method based on point-line characteristics

Country Status (1)

Country Link
CN (1) CN112862768B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113298796B (en) * 2021-06-10 2024-04-19 西北工业大学 Line characteristic SLAM initialization method based on maximum posterior IMU
CN113376669B (en) * 2021-06-22 2022-11-15 东南大学 Monocular VIO-GNSS fusion positioning algorithm based on dotted line characteristics
CN114234959B (en) * 2021-12-22 2024-02-20 深圳市普渡科技有限公司 Robot, VSLAM initialization method, device and readable storage medium
CN114998389A (en) * 2022-06-20 2022-09-02 珠海格力电器股份有限公司 Indoor positioning method
CN116957958A (en) * 2023-06-25 2023-10-27 东南大学 VIO front end improvement method based on inertia prior correction image gray scale

Citations (9)

Publication number Priority date Publication date Assignee Title
CN109544696A (en) * 2018-12-04 2019-03-29 中国航空工业集团公司西安航空计算技术研究所 Airborne enhanced synthetic vision virtual-real image precision registration method based on a visual-inertial combination
CN110030994A (en) * 2019-03-21 2019-07-19 东南大学 A kind of robustness vision inertia close coupling localization method based on monocular
CN110375738A (en) * 2019-06-21 2019-10-25 西安电子科技大学 Monocular simultaneous localization and mapping pose calculation method fusing an inertial measurement unit
CN110411476A (en) * 2019-07-29 2019-11-05 视辰信息科技(上海)有限公司 Vision inertia odometer calibration adaptation and evaluation method and system
CN110702107A (en) * 2019-10-22 2020-01-17 北京维盛泰科科技有限公司 Monocular vision inertial combination positioning navigation method
CN110763251A (en) * 2019-10-18 2020-02-07 华东交通大学 Method and system for optimizing visual inertial odometer
CN111197984A (en) * 2020-01-15 2020-05-26 重庆邮电大学 Vision-inertial motion estimation method based on environmental constraint
CN111578937A (en) * 2020-05-29 2020-08-25 天津工业大学 Visual inertial odometer system capable of optimizing external parameters simultaneously
CN111780754A (en) * 2020-06-23 2020-10-16 南京航空航天大学 Visual inertial odometer pose estimation method based on sparse direct method

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US10630962B2 (en) * 2017-01-04 2020-04-21 Qualcomm Incorporated Systems and methods for object location
EP3451288A1 (en) * 2017-09-04 2019-03-06 Universität Zürich Visual-inertial odometry with an event camera
CN108981693B (en) * 2018-03-22 2021-10-29 东南大学 VIO rapid joint initialization method based on monocular camera
CN111156984B (en) * 2019-12-18 2022-12-09 东南大学 Monocular vision inertia SLAM method oriented to dynamic scene

Non-Patent Citations (4)

Title
PLS-VIO: Stereo Vision-inertial Odometry Based on Point and Line Features;Huanyu Wen;《2020 International Conference on High Performance Big Data and Intelligent Systems (HPBD&IS)》;20200701;1-7 *
Trifo-VIO: Robust and Efficient Stereo Visual Inertial Odometry Using Points and Lines;Feng Zheng;《2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)》;20190107;3686-3693 *
Software Design of a Visual-Inertial Odometer Based on an IMU and Monocular Vision Fusion Algorithm; Huang Renqiang; China Masters' Theses Full-text Database, Information Science and Technology; 20210115; I138-365 *
Research on Visual-Inertial Odometry Methods Based on Combined Point-Line Features; Jiang Mancheng; China Masters' Theses Full-text Database, Information Science and Technology; 20210115; I138-1846 *

Also Published As

Publication number Publication date
CN112862768A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
CN112862768B (en) Adaptive monocular VIO (visual image analysis) initialization method based on point-line characteristics
CN112634451B (en) Outdoor large-scene three-dimensional mapping method integrating multiple sensors
CN109307508B (en) Panoramic inertial navigation SLAM method based on multiple key frames
CN107869989B (en) Positioning method and system based on visual inertial navigation information fusion
CN109029433B (en) Method for calibrating external parameters and time sequence based on vision and inertial navigation fusion SLAM on mobile platform
Yang et al. Monocular object and plane slam in structured environments
CN110389348B (en) Positioning and navigation method and device based on laser radar and binocular camera
CN108682027A (en) VSLAM realization method and systems based on point, line Fusion Features
CN112649016A (en) Visual inertial odometer method based on point-line initialization
Liu et al. Direct visual odometry for a fisheye-stereo camera
CN112734841B (en) Method for realizing positioning by using wheel type odometer-IMU and monocular camera
CN114323033B (en) Positioning method and equipment based on lane lines and feature points and automatic driving vehicle
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
CN114485640A (en) Monocular vision inertia synchronous positioning and mapping method and system based on point-line characteristics
CN114529576A (en) RGBD and IMU hybrid tracking registration method based on sliding window optimization
CN112556719A (en) Visual inertial odometer implementation method based on CNN-EKF
CN112101160A (en) Binocular semantic SLAM method oriented to automatic driving scene
Chen et al. Stereo visual inertial pose estimation based on feedforward-feedback loops
Li et al. A binocular MSCKF-based visual inertial odometry system using LK optical flow
CN115147344A (en) Three-dimensional detection and tracking method for parts in augmented reality assisted automobile maintenance
CN115218889A (en) Multi-sensor indoor positioning method based on dotted line feature fusion
CN113763470B (en) RGBD visual inertia simultaneous positioning and map construction with point-line feature fusion
Zhao et al. Robust depth-aided rgbd-inertial odometry for indoor localization
Wen et al. Dense point cloud map construction based on stereo VINS for mobile vehicles
Mu et al. Visual navigation features selection algorithm based on instance segmentation in dynamic environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant