CN114485637A - Visual and inertial mixed pose tracking method of head-mounted augmented reality system

Info

Publication number
CN114485637A
CN114485637A (application CN202210055655.3A)
Authority
CN
China
Prior art keywords
pose
visual
augmented reality
inertial measurement
head
Prior art date
Legal status
Pending
Application number
CN202210055655.3A
Other languages
Chinese (zh)
Inventor
何宁
晁建刚
许振瑛
胡福超
黄冉
佘佳宏
郭俊鹏
陈炜
Current Assignee
63919 Troops of PLA
Original Assignee
63919 Troops of PLA
Priority date
Filing date
Publication date
Application filed by 63919 Troops of PLA
Priority claimed from CN202210055655.3A
Publication of CN114485637A
Legal status: Pending

Classifications

    • G01C 21/165 Dead reckoning by integrating acceleration or speed (inertial navigation) combined with non-inertial navigation instruments
    • G01C 21/18 Stabilised platforms, e.g. by gyroscope
    • G06F 17/13 Complex mathematical operations: differential equations
    • G06F 17/16 Complex mathematical operations: matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06F 18/23213 Non-hierarchical clustering with a fixed number of clusters, e.g. K-means clustering
    • G06T 17/005 Three-dimensional modelling: tree description, e.g. octree, quadtree
    • G06T 7/20 Image analysis: analysis of motion
    • G06T 7/70 Image analysis: determining position or orientation of objects or cameras


Abstract

The invention discloses a visual and inertial mixed pose tracking method for a head-mounted augmented reality system. The method first performs device initialization with measurement preprocessing of the vision sensor and the inertial measurement unit: the frame images acquired by the vision sensor are processed, the inertial measurements corresponding to two consecutive frame images are pre-integrated, and the vision sensor and the inertial measurement unit are jointly calibrated. Using the measurement data acquired by the vision sensor and the inertial measurement unit, the pre-integrated inertial measurements, the visual image feature observations and the loop detection results are fused to realize fusion positioning of the head-mounted augmented reality system; loop detection is performed on the positioning result, and global pose optimization is carried out with the relocation result to obtain the final pose estimation result. The invention improves the intelligence and usability of augmented reality operation assistance for astronauts and effectively improves astronaut training efficiency.

Description

Visual and inertial mixed pose tracking method of head-mounted augmented reality system
Technical Field
The invention relates to the field of manned space flight, in particular to a visual and inertial mixed pose tracking method of a head-mounted augmented reality system.
Background
During astronaut training, tracking the head motion of the astronaut, and from it the gaze direction, is of great significance for evaluating the astronaut's operational behavior and actions. To reproduce an aerospace operation scene during training, the astronaut generally wears a head-mounted augmented reality system. Estimating the pose of this head-mounted augmented reality system and obtaining its pose data can effectively support the training of astronauts on operating equipment and provide a basis for evaluating the training effect. Among existing methods for real-time estimation of equipment pose, optical methods are easily occluded and their performance is limited in poor lighting, while a visual-inertial system cannot directly obtain the accurate state of the system because of sensor limitations, so processes such as initialization, extrinsic calibration and nonlinear optimization of the system state estimator are difficult to carry out in a static state. The inertial measurement unit (IMU) has a high output frequency and can output the pose state over short periods, but its pose estimation accuracy depends on the calibration process; if the IMU cannot be effectively calibrated, the estimation accuracy is affected.
Disclosure of Invention
The invention discloses a visual and inertial mixed pose tracking method of a head-mounted augmented reality system, aiming at the pose tracking problem of the head-mounted augmented reality system worn by astronauts in the training process.
The vision and inertia hybrid tracking method uses a vision sensor and an inertial measurement unit (IMU) as the minimum sensor combination and estimates the six-degree-of-freedom (DoF) state of the head-mounted augmented reality system by forming a visual-inertial system. Because of sensor limitations, the visual-inertial system cannot directly obtain the accurate state of the system, so processes such as initialization, extrinsic calibration and nonlinear optimization of the system state estimator are difficult to carry out in a static state. Six-degree-of-freedom state estimation of the head-mounted augmented reality system is realized by fusing visual feature observations with pre-integrated IMU measurements: the system first runs the estimator initialization and failure-recovery process; after initialization it starts monocular VIO to obtain accurate visual-inertial estimates; loop detection is used to quickly compute relocation; and a four-degree-of-freedom pose graph optimization is used to improve the global consistency of the pose estimates. Finally, the performance improvement of the system estimate is evaluated by comparison with visual-only SLAM.
The invention discloses a visual and inertial mixed pose tracking method of a head-mounted augmented reality system, which comprises the steps of firstly carrying out equipment initialization, including measurement preprocessing on a visual sensor and an inertial measurement device, carrying out combined calibration on the visual sensor and the inertial measurement device, then carrying out fusion positioning on the head-mounted augmented reality system by using measurement data acquired by the visual sensor and the inertial measurement device, and carrying out loop detection and global pose graph optimization on a positioning result to obtain a final pose estimation result.
The measurement preprocessing process comprises processing frame images acquired by a vision sensor and performing pre-integration on inertial measurement results corresponding to two continuous frame images;
the method comprises the steps of carrying out combined calibration on a vision sensor and an inertia measurement device, calibrating the inertia measurement device to obtain an initial estimation result of a target attitude, a speed, a gravity vector, a gyroscope bias and a three-dimensional characteristic position, guiding a Visual-Inertial odometer VIO (VIO) based on nonlinear optimization, and calculating an absolute scale factor, the gyroscope bias, an acceleration bias, a gravity acceleration and a relative speed of the inertia measurement device to the vision sensor at each measurement moment.
The fusion positioning process fuses the pre-integrated inertial measurements with the visual image feature observations to realize fusion positioning of the head-mounted augmented reality system.
The global pose graph optimization performs global optimization using the relocation result in order to eliminate drift.
In the equipment initialization process, firstly, a visual SFM algorithm based on a sliding window is initialized, then the pose change of the head-mounted augmented reality system is calculated, and the IMU pre-integration result is matched with the visual SFM algorithm result to restore the scale, gravity, speed and deviation so as to guide the VIO based on nonlinear optimization.
The initialization of the sliding-window-based visual SFM algorithm includes feature point tracking, camera pose estimation and optimization. For feature point tracking, the feature correspondences between the latest frame image acquired by the head-mounted augmented reality system and all frame images in the sliding window are checked. If the latest frame image and any frame image in the sliding window share more than a certain number of tracked features and enough rotation-compensated parallax (in pixels), the relative rotation and scaled translation of the latest frame image are computed from those tracked features and all features in the two frames are triangulated; otherwise the latest frame image is stored in the sliding window and a new frame image is awaited. For camera pose estimation, the poses of all frame images in the sliding window are estimated from the triangulated features with the PnP method, giving the camera poses. The optimization is realized by minimizing the total reprojection error of all feature observations with a BA (bundle adjustment) optimization method, completing the optimized estimate of the camera attitude.
Let P be the position in real space of the object corresponding to the feature points of the left and right frame images acquired by the left-eye and right-eye cameras respectively, called the feature position, and let the feature points p1 and p2 be the projections of that object onto the left-eye and right-eye cameras, whose centers are O1 and O2. Writing P = [X, Y, Z]^T, the pinhole camera model gives for the feature points p1, p2 the relations:
s1 p1 = K1 P
s2 p2 = K2 (R P + t),
where K1 and K2 are the intrinsic matrices of the left-eye and right-eye cameras, R and t are the rotation matrix and translation vector between the left-eye and right-eye cameras, and s1, s2 are the scale factors of the left-eye and right-eye cameras. Stacking these projection constraints gives a linear system in P whose least-squares solution yields the feature position estimate.
From the feature point matching information and the feature position estimates, the SFM algorithm obtains the three-dimensional spatial position of each current feature point and its projected position in the camera, and the image pose is then estimated directly with the PnP algorithm.
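As an illustration of the triangulation step above, the following is a minimal sketch (not part of the patent text) of recovering the feature position P from the two pinhole projection constraints by a linear least-squares solve; the intrinsics K1, K2 and the stereo extrinsics R, t are assumed known from calibration, and the numeric values in the usage example are made up.

```python
import numpy as np

def triangulate_point(p1, p2, K1, K2, R, t):
    """Linear least-squares triangulation of one feature from a stereo pair.

    p1, p2 : pixel coordinates (u, v) of the feature in the left/right images.
    K1, K2 : 3x3 intrinsic matrices of the left/right cameras.
    R, t   : rotation (3x3) and translation (3,) from the left to the right camera.
    Returns the 3D feature position P in the left-camera frame.
    """
    # Projection matrices for the two views (left camera is the reference frame).
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])   # 3x4
    P2 = K2 @ np.hstack([R, t.reshape(3, 1)])            # 3x4

    # Each view contributes two linear constraints on the homogeneous point.
    u1, v1 = p1
    u2, v2 = p2
    A = np.vstack([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])

    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Example usage with made-up values.
if __name__ == "__main__":
    K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
    R = np.eye(3)
    t = np.array([-0.1, 0.0, 0.0])                  # assumed 10 cm stereo baseline
    P_true = np.array([0.2, -0.1, 2.0])
    p1 = (K @ P_true)[:2] / P_true[2]
    p2h = K @ (R @ P_true + t)
    p2 = p2h[:2] / p2h[2]
    print(triangulate_point(p1, p2, K, K, R, t))    # ~ [0.2, -0.1, 2.0]
```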
The calibration of the inertial measurement unit includes calibrating the gyroscope bias, refining the gravity vector, and calibrating the velocity, gravity vector and scale. For the gyroscope bias calibration, consider two consecutive frame images bk and bk+1 in the image window, for which the visual SFM algorithm gives the corresponding rotations q̂^c0_bk and q̂^c0_bk+1; denote the gyroscope bias correction between the time of frame image bk+1 and the time of frame image bk by δbω, and denote the pre-integrated rotation of the corresponding inertial measurements by γ̂^bk_bk+1. Linearizing the pre-integration term with respect to the gyroscope bias of the inertial measurement unit, the following minimization loss function is constructed:
min over δbω of Σ_{k∈B} ‖ (q̂^c0_bk+1)^(-1) ⊗ q̂^c0_bk ⊗ γ̂^bk_bk+1 ‖²,
γ̂^bk_bk+1 ≈ γ̂^bk_bk+1 ⊗ [1, 1/2·J^γ_bω·δbω]^T,
where B is the set of all frame image sequence numbers in the image window and J^γ_bω is the bias Jacobian of the pre-integrated rotation with respect to the gyroscope bias. Using this bias Jacobian to form a first-order approximation in the gyroscope bias, solving the minimization loss function yields the initial calibration of the gyroscope bias bω.
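For illustration only, the sketch below shows how such a gyroscope-bias correction could be solved as a linear least-squares problem once the per-frame visual rotations, the pre-integrated IMU rotations and their bias Jacobians are available; it is not taken from the patent, the frame conventions are assumed, and scipy's Rotation class is used for the quaternion algebra.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def calibrate_gyro_bias(q_vis, gamma_imu, J_bw):
    """Estimate a gyroscope bias correction delta_bw from rotation mismatches.

    q_vis     : list of visual rotations of each frame w.r.t. the reference frame c0
                (scipy Rotation objects), one per frame in the window.
    gamma_imu : list of pre-integrated IMU rotations between consecutive frames.
    J_bw      : list of 3x3 Jacobians of each pre-integrated rotation w.r.t. the bias.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for k in range(len(gamma_imu)):
        # Rotation mismatch between vision and pre-integrated IMU for frames k, k+1.
        err = gamma_imu[k].inv() * q_vis[k].inv() * q_vis[k + 1]
        # Small-angle residual: twice the vector part of the error quaternion.
        res = 2.0 * err.as_quat()[:3]          # scipy quaternions are (x, y, z, w)
        A += J_bw[k].T @ J_bw[k]
        b += J_bw[k].T @ res
    return np.linalg.solve(A, b)               # delta_bw, added to the current bias
```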
After the gyroscope bias is initialized, the velocity, the gravity vector and the actual scale are computed; the quantities to be calculated are collected in the vector
X_I = [v_b0, v_b1, …, v_bn, g^c0, s]^T,
where v_bk is the velocity of the target expressed in the body coordinate system at the time the k-th frame image is acquired, g^c0 is the gravity vector expressed in the frame of the c0-th frame image, and s is the scale factor that aligns the visual SFM result to metric units.
For the joint calibration of the vision sensor and the inertial measurement unit, let C_obj denote the object coordinate system, which is taken as the world coordinate system; C_c the vision sensor coordinate system; and C_i the inertial measurement unit coordinate system. The relative pose between C_c and C_obj can be obtained by a vision sensor calibration method; here R_a and t_a denote the attitude and the three translation components of the vision sensor relative to the world coordinate system. Through the relative position t_x between C_i and C_c, the relative pose R_x between C_i and C_obj is obtained. The vision sensor and the inertial measurement unit are calibrated by observing a known calibration reference in space from different positions, and the relationship between R_x and t_x is derived from the results of several observations.
The joint calibration of the vision sensor and the inertial measurement unit requires multi-dimensional parameter information from the inertial measurement unit, obtained as follows. First, the 3D information is read directly: the set of target attitude angles represented by three Euler angles, together with the raw angular velocity and linear acceleration, is obtained directly from the inertial measurement unit, giving the required target attitude at each measurement time. Then, assuming the target initial position is zero, i.e. the initial position s0 = 0, the target position s1 reached after a time Δt at initial velocity v is estimated by the linear formula s1 = vΔt + 1/2·aΔt², where a is the target acceleration. The multi-dimensional parameter information thus comprises the target attitude angles, the raw angular velocity, the linear acceleration, the target attitude over time, and the target position.
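As a small illustration (not from the patent) of the linear position update s1 = vΔt + 1/2·aΔt² used above, the sketch below propagates a position estimate from IMU-style acceleration readings; the sample rate and readings in the example are assumed values.

```python
import numpy as np

def propagate_position(s0, v0, accels, dt):
    """Dead-reckon position from an initial state and a sequence of accelerations.

    s0, v0 : initial position and velocity (3-vectors); the patent assumes s0 = 0.
    accels : iterable of linear-acceleration samples a (3-vectors), gravity removed.
    dt     : sampling interval in seconds.
    """
    s, v = np.asarray(s0, float), np.asarray(v0, float)
    for a in accels:
        a = np.asarray(a, float)
        s = s + v * dt + 0.5 * a * dt**2   # s1 = v*dt + 1/2*a*dt^2 per step
        v = v + a * dt
    return s, v

# Example: constant 0.1 m/s^2 acceleration along x for one second at 100 Hz.
s1, v1 = propagate_position([0, 0, 0], [0, 0, 0], [[0.1, 0, 0]] * 100, 0.01)
print(s1, v1)   # roughly [0.05, 0, 0] and [0.1, 0, 0]
```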
The head-mounted augmented reality system is positioned by fusing the measurement data acquired by the vision sensor and the inertial measurement unit; the fusion positioning comprises measurement preprocessing, visual and inertial data fusion, relocation, and global pose graph optimization.
The measurement preprocessing comprises visual front-end processing and IMU data pre-integration.
In the visual front-end processing, the feature points of each frame image are tracked by the LK sparse optical flow algorithm while corner feature points of each frame image are detected; 300 features are maintained for each frame image, and the features in each frame image are kept uniformly distributed by enforcing a minimum pixel distance between two adjacent features.
The corner feature points are detected using the second-derivative matrix of the image gray intensity, which is expressed as
H(p) = [ ∂²I/∂x²    ∂²I/∂x∂y ]
       [ ∂²I/∂x∂y   ∂²I/∂y²  ],
where p denotes a pixel in the image and I denotes the gray intensity at pixel p. A second-derivative image is obtained by computing the second derivatives of the gray intensity at all pixels of a frame image, and the autocorrelation matrix of the second-derivative image is computed in a fixed window around each pixel:
M(x, y) = Σ_{-K≤i,j≤K} w_{i,j} [ I_x(x+i, y+j)²                I_x(x+i, y+j)·I_y(x+i, y+j) ]
                                [ I_x(x+i, y+j)·I_y(x+i, y+j)   I_y(x+i, y+j)²              ],
where w_{i,j} is the normalized weight of the pixel in the i-th row and j-th column of the fixed window, used to produce a circular window with Gaussian weighting, K is the number of pixels per row or column of the fixed window, and I_x(x, y) and I_y(x, y) are the gray-intensity derivatives used to compute the correlation at the x-th row and y-th column of the surrounding window. The eigenvalues of the autocorrelation matrix are computed at each pixel, and if the minimum eigenvalue is still greater than the set minimum threshold, the pixel is taken as a corner feature point.
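For illustration (not part of the patent), the following sketch selects corner features by the minimum eigenvalue of the local gradient autocorrelation matrix, the criterion described above, using OpenCV's goodFeaturesToTrack with the Shi-Tomasi score; the 300-feature budget and the minimum pixel distance follow the front-end description, while the quality level is an assumed value.

```python
import cv2
import numpy as np

def detect_corners(gray, max_features=300, min_distance=30):
    """Detect up to `max_features` corner points, enforcing spatial separation.

    gray         : single-channel uint8 image.
    min_distance : minimum pixel distance between two accepted features, which
                   keeps the features roughly uniformly distributed.
    """
    corners = cv2.goodFeaturesToTrack(
        gray,
        maxCorners=max_features,
        qualityLevel=0.01,        # relative threshold on the minimum eigenvalue
        minDistance=min_distance,
        useHarrisDetector=False,  # False -> Shi-Tomasi minimum-eigenvalue score
    )
    return np.squeeze(corners, axis=1) if corners is not None else np.empty((0, 2))
```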
Tracking the characteristic points of each frame of image through an LK sparse optical flow algorithm, and selecting a key frame image by adopting an average parallax standard or a tracking quality standard; in the average parallax criterion, if the average parallax of the tracking features between the current frame image and the latest key frame image exceeds 10 °, the current frame image is taken as a new key frame image, and the image rotation is compensated by using the short-term integral value of the measurement result of the inertial measurement device in calculating the average parallax. In the tracking quality criterion, if the number of tracking features is below a certain set value, the current frame image is taken as a new key frame image.
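The sketch below (illustrative, not from the patent) tracks the previous frame's features with pyramidal LK optical flow and applies the two key-frame criteria described above; the pixel-parallax threshold is an assumed stand-in for the angular threshold, and the minimum track count is likewise assumed.

```python
import cv2
import numpy as np

def track_and_select_keyframe(prev_gray, cur_gray, prev_pts,
                              parallax_thresh_px=15.0, min_tracked=50):
    """Track features with pyramidal LK optical flow and decide whether the
    current frame becomes a key frame (average-parallax or tracking-quality test)."""
    p0 = prev_pts.reshape(-1, 1, 2).astype(np.float32)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, p0, None)
    ok = status.ravel() == 1
    prev_ok = p0.reshape(-1, 2)[ok]
    cur_ok = p1.reshape(-1, 2)[ok]

    # Average parallax of the surviving tracks between the two frames.
    parallax = float(np.linalg.norm(cur_ok - prev_ok, axis=1).mean()) if len(cur_ok) else 0.0

    # Key-frame criteria: large average parallax, or too few tracked features.
    is_keyframe = parallax > parallax_thresh_px or len(cur_ok) < min_tracked
    return cur_ok, is_keyframe
```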
The IMU data pre-integration is characterized in that an IMU integration model is utilized, an IMU pre-integration algorithm is adopted, and an observed value of inertia measurement data, a covariance matrix of residual errors and a Jacobian matrix of the residual errors are calculated, wherein the observed value is used for calculating the residual errors, the Jacobian matrix of the residual errors is a gradient direction for optimizing data fusion, and the covariance matrix is a weight value corresponding to the observed value and is used for fusing the inertia measurement data and visual data.
In the IMU integration model, the inertial measurements are obtained directly in the body coordinate system, so the measured values contain gravity information together with the actual inertial motion of the head-mounted augmented reality system, and are affected by the acceleration bias b_a, the gyroscope bias b_ω and additive noise.
Combining the image frames bk and bk+1, which correspond to two time instants, the position, velocity and orientation of the head-mounted augmented reality system in the world coordinate system over the time interval [tk, tk+1] are computed from the inertial measurements.
After changing the reference coordinate system from the world coordinate system to the body coordinate system, the IMU pre-integration algorithm pre-integrates only the variables related to the linear acceleration â and the angular velocity ω̂ expressed in the body frame.
In the visual and inertial data fusion process, for two consecutive frame images bk and bk+1 in the sliding window, the residual of the pre-integrated inertial measurements is computed according to the IMU measurement model from the noisy accelerometer and gyroscope measurements in the time interval between the two consecutive frame images.
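For illustration (not part of the patent text), a minimal Euler-integration sketch of IMU pre-integration between two frames is given below: it accumulates the relative position, velocity and rotation increments in the body frame of the first frame, with assumed bias values subtracted from the raw measurements.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def preintegrate(acc, gyro, dt, b_a=np.zeros(3), b_w=np.zeros(3)):
    """Pre-integrate IMU samples between frames b_k and b_{k+1}.

    acc, gyro : arrays of shape (N, 3) with raw accelerometer / gyroscope samples.
    dt        : sampling interval in seconds.
    b_a, b_w  : accelerometer and gyroscope biases to subtract.
    Returns (alpha, beta, gamma): position, velocity and rotation increments
    expressed in the body frame at time k (independent of the world-frame state).
    """
    alpha = np.zeros(3)          # position increment
    beta = np.zeros(3)           # velocity increment
    gamma = R.identity()         # rotation increment

    for a_m, w_m in zip(acc, gyro):
        a = gamma.apply(a_m - b_a)          # rotate body acceleration into frame b_k
        alpha = alpha + beta * dt + 0.5 * a * dt**2
        beta = beta + a * dt
        gamma = gamma * R.from_rotvec((w_m - b_w) * dt)

    return alpha, beta, gamma
```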
In the visual and inertial data fusion process, a marginalization method is used to reduce the computational complexity of the nonlinear-optimization-based VIO: selected inertial state quantities x_k and feature quantities λ_l are marginalized out of the sliding window, while the quantities corresponding to the marginalized states are converted into prior data, so that spatially separated key frames remain in the sliding window and provide sufficient parallax for feature triangulation. When the next new frame image is a key frame image, it is retained in the sliding window and the measurements corresponding to the earliest frame image are marginalized. If the next new frame image is a non-key frame image, only its inertial measurements are retained and its visual measurements are discarded, the inertial measurements remaining associated with this non-key frame. The marginalization is computed with the Schur complement, and a new prior is constructed from all marginalized measurements related to the removed states and added to the existing prior.
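A small numeric sketch (not from the patent) of marginalization via the Schur complement: given the normal-equation blocks for the states to be marginalized (m) and the states to keep (r), the prior passed on to the remaining states is the Schur complement of the marginalized block.

```python
import numpy as np

def marginalize(H, b, m_idx, r_idx):
    """Marginalize states `m_idx` out of the linear system H * dx = b.

    H, b  : Hessian (J^T J) and gradient (J^T r) of the full problem.
    Returns (H_prior, b_prior) acting only on the remaining states `r_idx`.
    """
    Hmm = H[np.ix_(m_idx, m_idx)]
    Hmr = H[np.ix_(m_idx, r_idx)]
    Hrm = H[np.ix_(r_idx, m_idx)]
    Hrr = H[np.ix_(r_idx, r_idx)]
    bm, br = b[m_idx], b[r_idx]

    Hmm_inv = np.linalg.inv(Hmm)              # small block, assumed well conditioned
    H_prior = Hrr - Hrm @ Hmm_inv @ Hmr       # Schur complement
    b_prior = br - Hrm @ Hmm_inv @ bm
    return H_prior, b_prior
```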
In the relocation process, loop detection is realized with a bag-of-words model. The bag-of-words model is first built from feature point data actually collected in the space operating environment; when the position of the vision sensor changes, a new feature point set is computed in real time, this feature point set is formed into a new description vector and matched against the key-frame feature description vectors, and if the match succeeds a loop is detected. The bag-of-words model is used to recognize previously visited places, a feature-level connection is then established between the loop-closure candidate and the current frame image, and the feature pairs are integrated into the nonlinear-optimization-based visual-inertial odometry (VIO) to obtain a drift-free target state estimate. By matching the feature point set obtained at a certain position against the bag of words, the features at that position are described as
A = a1 w1 + a2 w2 + a3 w3 + ... + an wn,
where wi denotes the i-th feature (word) vector in the bag of words and ai denotes the frequency with which that feature occurs; [a1, a2, ..., an] is the description vector. The L1 norm is used to compare the distance between description vectors, i.e.
d(a, b) = Σ_{i=1}^{W} | a_i − b_i |,
where a and b are the description vectors being compared and W is the dimension of the description vectors.
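An illustrative sketch (not from the patent) of building a description vector from word occurrences and comparing two images with the L1 distance described above; the vocabulary size and word indices in the example are assumed values.

```python
import numpy as np

def describe(word_ids, vocab_size):
    """Build a bag-of-words description vector [a1, ..., aW] from the vocabulary
    word indices assigned to an image's features (term frequencies)."""
    a = np.zeros(vocab_size)
    for w in word_ids:
        a[w] += 1.0
    return a / max(len(word_ids), 1)

def l1_distance(a, b):
    """L1 distance between two description vectors of dimension W."""
    return float(np.sum(np.abs(a - b)))

# Example: two images whose features fall on a 1000-word vocabulary (assumed size).
img1 = describe([3, 3, 17, 250, 999], vocab_size=1000)
img2 = describe([3, 17, 18, 250, 998], vocab_size=1000)
print(l1_distance(img1, img2))   # small distance -> loop-closure candidate
```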
The bag-of-words model is built with a clustering method, and the bag of words is represented as a k-ary tree. All features collected in a preliminary test provide the feature point data, from which a number of feature point descriptors are randomly selected as clustering samples. The clustering is implemented with the K-Means algorithm: the number of clusters K is fixed, and K D-dimensional feature descriptors are randomly selected as initial cluster centers, where D is the dimension of the feature descriptors; the distance from each feature descriptor to every cluster center is computed and the nearest cluster center is chosen as that descriptor's class; within each class the cluster center is then recomputed, and the assignment and update steps are iterated until the sample data converge. Loop detection is implemented with the TF-IDF algorithm.
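The following is a compact K-Means sketch (illustrative, not from the patent) of the assign-and-update loop used to build one level of the vocabulary tree; the descriptor type and the iteration cap are assumptions.

```python
import numpy as np

def kmeans(samples, k, iters=20, seed=0):
    """Cluster D-dimensional feature descriptors into k words (one tree level).

    samples : array of shape (N, D), e.g. binary descriptors cast to float.
    Returns the (k, D) cluster centers.
    """
    rng = np.random.default_rng(seed)
    centers = samples[rng.choice(len(samples), size=k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest center.
        d = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its members.
        new_centers = np.array([samples[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break                      # converged
        centers = new_centers
    return centers
```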
When a loop is detected, the BRIEF feature descriptors are matched, feature consistency is checked, and a connection between the local sliding window and the loop candidate is established. Outliers are removed with geometric outlier rejection: a fundamental matrix test is performed with the RANSAC algorithm using the 2D features retrieved in the current image and in the candidate loop image, and PnP position verification is performed with the RANSAC algorithm using the known 3D positions of the features in the local sliding window and their 2D projections in the candidate loop image.
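An illustrative sketch (not from the patent) of the two RANSAC checks described above, using OpenCV's fundamental-matrix estimation and PnP solver; the thresholds and the inlier count are assumed values.

```python
import cv2
import numpy as np

def verify_loop(pts_cur_2d, pts_loop_2d, pts_window_3d, pts_loop_2d_for_pnp, K):
    """Geometric verification of a loop candidate.

    pts_cur_2d, pts_loop_2d : matched 2D points in the current and loop images.
    pts_window_3d           : 3D positions of features known in the local window.
    pts_loop_2d_for_pnp     : their 2D projections in the candidate loop image.
    K                       : camera intrinsic matrix.
    """
    # 2D-2D check: fundamental matrix with RANSAC.
    F, inl_f = cv2.findFundamentalMat(pts_cur_2d, pts_loop_2d,
                                      cv2.FM_RANSAC, 1.0, 0.99)
    # 3D-2D check: PnP with RANSAC.
    ok, rvec, tvec, inl_pnp = cv2.solvePnPRansac(
        pts_window_3d.astype(np.float32),
        pts_loop_2d_for_pnp.astype(np.float32),
        K, None, reprojectionError=3.0)
    enough = inl_f is not None and inl_pnp is not None and len(inl_pnp) > 25
    return ok and enough, rvec, tvec
```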
In the global pose graph optimization, after relocation the local sliding window is shifted and aligned with past poses. Using the relocation result, registration of the set of past poses into a globally consistent configuration is ensured by global pose graph optimization. When a key frame is marginalized out of the sliding window, it is added to the pose graph as a new vertex, and this new vertex is connected to the other vertices in the pose graph by sequential edges and/or loop edges: the newly marginalized key frame establishes several sequential edges associating it with existing key frames, where a sequential edge represents the relative transformation between two key frames in the local sliding window and its value is taken directly from the VIO.
If the newly marginalized key frame has a loop connection with an existing key frame, a loop edge in the pose graph connects it to the loop frame; the loop edge contains only a 4-degree-of-freedom relative pose transformation, and its value is obtained from the relocation result.
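For illustration (not part of the patent), the sketch below evaluates the residual of one 4-degree-of-freedom edge (relative x, y, z and yaw) between two pose-graph vertices, which is the kind of constraint a sequential or loop edge described above would contribute to the graph optimizer.

```python
import numpy as np

def yaw_rotation(yaw):
    """Rotation matrix about the gravity axis for a given yaw angle (radians)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def edge_residual(p_i, yaw_i, p_j, yaw_j, rel_p_meas, rel_yaw_meas):
    """Residual of a 4-DoF edge between vertices i and j.

    p_i, p_j     : vertex positions (3-vectors) in the world frame.
    yaw_i, yaw_j : vertex yaw angles.
    rel_p_meas   : measured relative translation of j expressed in frame i
                   (from the VIO for a sequential edge, from relocation for a loop edge).
    rel_yaw_meas : measured relative yaw between i and j.
    """
    rel_p_est = yaw_rotation(yaw_i).T @ (np.asarray(p_j) - np.asarray(p_i))
    r_p = rel_p_est - np.asarray(rel_p_meas)
    r_yaw = (yaw_j - yaw_i) - rel_yaw_meas
    # Wrap the yaw residual into (-pi, pi].
    r_yaw = (r_yaw + np.pi) % (2.0 * np.pi) - np.pi
    return np.append(r_p, r_yaw)          # 4-vector fed to the graph optimizer
```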
The beneficial effects of the invention include: the invention can carry out accurate pose estimation on the head-mounted augmented reality system, obtain the pose data thereof, effectively support the training process of the operating equipment of the astronaut and provide a basis for the evaluation of the training effect thereof. The invention can effectively improve the intelligence and the usability of the augmented reality operation assistance of the astronaut and effectively improve the training efficiency of the astronaut.
Drawings
FIG. 1 is a flow chart of an implementation of the method of the present invention;
FIG. 2 is a schematic diagram of the visual inertial alignment process for estimator initialization of the present invention;
FIG. 3 is a schematic view of the epipolar geometry constraint of the present invention;
FIG. 4 is a schematic diagram of the gravity 2 degree-of-freedom parameterization of the present invention;
FIG. 5 is a schematic diagram of relative positions of coordinate systems before and after movement according to the present invention;
FIG. 6 is a schematic view of two movements of the vision sensor and inertial measurement unit platform of the present invention;
FIG. 7 is a graph of visual residuals on a unit sphere according to the present invention;
FIG. 8 is a schematic diagram of the marginalization strategy of the present invention;
FIG. 9 is a schematic view of a sliding window monocular VIO relocation of the present invention;
FIG. 10 is a schematic view of a 5-level 10-pronged tree descriptor bag of the present invention;
FIG. 11 is a schematic diagram of the repositioning and pose graph optimization process of the present invention.
Detailed Description
For a better understanding of the present disclosure, two examples are given herein.
FIG. 1 is a flow chart of an implementation of the method of the present invention; FIG. 2 is a schematic diagram of the visual inertial alignment process for estimator initialization of the present invention; FIG. 3 is a schematic view of the epipolar geometry constraint of the present invention; FIG. 4 is a schematic diagram of the gravity 2 degree-of-freedom parameterization of the present invention; FIG. 5 is a schematic diagram of relative positions of coordinate systems before and after movement according to the present invention; FIG. 6 is a schematic view of two movements of the vision sensor and inertial measurement unit platform of the present invention; FIG. 7 is a graph of the visual residual on a unit sphere of the present invention; FIG. 8 is a schematic diagram of the marginalization strategy of the present invention; FIG. 9 is a schematic view of a sliding window monocular VIO relocation of the present invention; FIG. 10 is a schematic view of a 5-level 10-pronged tree descriptor bag of the present invention; FIG. 11 is a schematic diagram of the repositioning and pose graph optimization process of the present invention.
The first embodiment is as follows:
the invention discloses a visual and inertial mixed pose tracking method of a head-mounted augmented reality system, which comprises the steps of firstly carrying out equipment initialization, including measurement preprocessing on a visual sensor and an inertial measurement device, carrying out combined calibration on the visual sensor and the inertial measurement device, then carrying out fusion positioning on the head-mounted augmented reality system by using measurement data acquired by the visual sensor and the inertial measurement device, and carrying out loop detection and global pose graph optimization on a positioning result to obtain a final pose estimation result.
The measurement preprocessing process comprises processing frame images acquired by a visual sensor and simultaneously performing pre-integration on inertial measurement results corresponding to two continuous frame images;
the method comprises the steps of carrying out combined calibration on a vision sensor and an inertia measurement device, calibrating the inertia measurement device to obtain an initial estimation result of a target attitude, a speed, a gravity vector, a gyroscope bias and a three-dimensional characteristic position, guiding a Visual-Inertial odometer VIO (VIO) based on nonlinear optimization, and calculating an absolute scale factor, the gyroscope bias, an acceleration bias, a gravity acceleration and a relative speed of the inertia measurement device to the vision sensor at each measurement moment.
The fusion positioning process fuses the pre-integrated inertial measurements, the visual image feature observations and the loop detection results to realize fusion positioning of the head-mounted augmented reality system.
The global pose graph optimization performs global optimization using the relocation result in order to eliminate drift.
In the equipment initialization process, firstly, a visual SFM algorithm based on a sliding window is initialized, then the pose change of the head-mounted augmented reality system is calculated, and the IMU pre-integration result is matched with the visual SFM algorithm result to recover the scale, gravity, speed and deviation so as to guide the VIO based on nonlinear optimization.
The initialization of the sliding-window-based visual SFM algorithm includes feature point tracking, camera pose estimation and optimization. For feature point tracking, the feature correspondences between the latest frame image acquired by the head-mounted augmented reality system and all frame images in the sliding window are checked. If the latest frame image and any frame image in the sliding window share more than a certain number of tracked features and enough rotation-compensated parallax (in pixels), the relative rotation and scaled translation of the latest frame image are computed from those tracked features and all features in the two frames are triangulated; otherwise the latest frame image is stored in the sliding window and a new frame image is awaited. For camera pose estimation, the poses of all frame images in the sliding window are estimated from the triangulated features with the PnP method, giving the camera poses. The optimization is realized by minimizing the total reprojection error of all feature observations with a BA (bundle adjustment) optimization method, completing the optimized estimate of the camera attitude.
Let P be the position in real space of the real object corresponding to the feature points of the left and right frame images acquired by the left-eye and right-eye cameras respectively, called the feature position, and let the feature points p1 and p2 be the projections of that object onto the left-eye and right-eye cameras, whose centers are O1 and O2. Writing P = [X, Y, Z]^T, the pinhole camera model gives for the feature points p1, p2 the relations:
s1 p1 = K1 P
s2 p2 = K2 (R P + t),
where K1 and K2 are the intrinsic matrices of the left-eye and right-eye cameras, R and t are the rotation matrix and translation vector between the left-eye and right-eye cameras, and s1, s2 are the scale factors of the left-eye and right-eye cameras. Stacking these projection constraints gives a linear system in P whose least-squares solution yields the feature position estimate.
From the feature point matching information and the feature position estimates, the SFM algorithm obtains the three-dimensional spatial position of each current feature point and its projected position in the camera, and the image pose is then estimated directly with the PnP algorithm.
The calibration of the inertial measurement unit includes calibrating the gyroscope bias, refining the gravity vector, and calibrating the velocity, gravity vector and scale. For the gyroscope bias calibration, consider two consecutive frame images bk and bk+1 in the image window, for which the visual SFM algorithm gives the corresponding rotations q̂^c0_bk and q̂^c0_bk+1; denote the gyroscope bias correction between the time of frame image bk+1 and the time of frame image bk by δbω, and denote the pre-integrated rotation of the corresponding inertial measurements by γ̂^bk_bk+1. Linearizing the pre-integration term with respect to the gyroscope bias of the inertial measurement unit, the following minimization loss function is constructed:
min over δbω of Σ_{k∈B} ‖ (q̂^c0_bk+1)^(-1) ⊗ q̂^c0_bk ⊗ γ̂^bk_bk+1 ‖²,
γ̂^bk_bk+1 ≈ γ̂^bk_bk+1 ⊗ [1, 1/2·J^γ_bω·δbω]^T,
where B is the set of all frame image sequence numbers in the image window and J^γ_bω is the bias Jacobian of the pre-integrated rotation with respect to the gyroscope bias. Using this bias Jacobian to form a first-order approximation in the gyroscope bias, solving the minimization loss function yields the initial calibration of the gyroscope bias bω.
After the gyroscope bias is initialized, the velocity, the gravity vector and the actual scale are computed; the quantities to be calculated are collected in the vector
X_I = [v_b0, v_b1, …, v_bn, g^c0, s]^T,
where v_bk is the velocity of the target expressed in the body coordinate system at the time the k-th frame image is acquired, g^c0 is the gravity vector expressed in the frame of the c0-th frame image, and s is the scale factor that aligns the visual SFM result to metric units.
For the joint calibration of the vision sensor and the inertial measurement unit, let C_obj denote the object coordinate system, which is taken as the world coordinate system; C_c the vision sensor coordinate system; and C_i the inertial measurement unit coordinate system. The relative pose between C_c and C_obj can be obtained by a vision sensor calibration method; here R_a and t_a denote the attitude and the three translation components of the vision sensor relative to the world coordinate system. Through the relative position t_x between C_i and C_c, the relative pose R_x between C_i and C_obj is obtained. The vision sensor and the inertial measurement unit are calibrated by observing a known calibration reference in space from different positions, and the relationship between R_x and t_x is derived from the results of several observations.
The joint calibration of the vision sensor and the inertial measurement unit requires multi-dimensional parameter information from the inertial measurement unit, obtained as follows. First, the 3D information is read directly: the set of target attitude angles represented by three Euler angles, together with the raw angular velocity and linear acceleration, is obtained directly from the inertial measurement unit, giving the required target attitude at each measurement time. Then, assuming the target initial position is zero, i.e. the initial position s0 = 0, the target position s1 reached after a time Δt at initial velocity v is estimated by the linear formula s1 = vΔt + 1/2·aΔt², where a is the target acceleration. The multi-dimensional parameter information thus comprises the target attitude angles, the raw angular velocity, the linear acceleration, the target attitude over time, and the target position.
The head-mounted augmented reality system is positioned by fusing the measurement data acquired by the vision sensor and the inertial measurement unit; the fusion positioning comprises measurement preprocessing, visual and inertial data fusion, relocation, and global pose graph optimization.
The measurement preprocessing comprises visual front-end processing and IMU data pre-integration.
In the visual front-end processing, the feature points of each frame image are tracked by the LK sparse optical flow algorithm while corner feature points of each frame image are detected; 300 features are maintained for each frame image, and the features in each frame image are kept uniformly distributed by enforcing a minimum pixel distance between two adjacent features.
The corner feature points are detected using the second-derivative matrix of the image gray intensity, which is expressed as
H(p) = [ ∂²I/∂x²    ∂²I/∂x∂y ]
       [ ∂²I/∂x∂y   ∂²I/∂y²  ],
where p denotes a pixel in the image and I denotes the gray intensity at pixel p. A second-derivative image is obtained by computing the second derivatives of the gray intensity at all pixels of a frame image, and the autocorrelation matrix of the second-derivative image is computed in a fixed window around each pixel:
M(x, y) = Σ_{-K≤i,j≤K} w_{i,j} [ I_x(x+i, y+j)²                I_x(x+i, y+j)·I_y(x+i, y+j) ]
                                [ I_x(x+i, y+j)·I_y(x+i, y+j)   I_y(x+i, y+j)²              ],
where w_{i,j} is the normalized weight of the pixel in the i-th row and j-th column of the fixed window, used to produce a circular window with Gaussian weighting, K is the number of pixels per row or column of the fixed window, and I_x(x, y) and I_y(x, y) are the gray-intensity derivatives used to compute the correlation at the x-th row and y-th column of the surrounding window. The eigenvalues of the autocorrelation matrix are computed at each pixel, and if the minimum eigenvalue is still greater than the set minimum threshold, the pixel is taken as a corner feature point.
Tracking the characteristic points of each frame of image through an LK sparse optical flow algorithm, and selecting a key frame image by adopting an average parallax standard or a tracking quality standard; in the average parallax criterion, if the average parallax of the tracking features between the current frame image and the latest key frame image exceeds 10 °, the current frame image is taken as a new key frame image, and the image rotation is compensated by using the short-term integral value of the measurement result of the inertial measurement device in calculating the average parallax. In the tracking quality criterion, if the number of tracking features is below a certain set value, the current frame image is taken as a new key frame image.
The IMU data pre-integration is characterized in that an IMU integration model is utilized, an IMU pre-integration algorithm is adopted, and an observed value of inertia measurement data, a covariance matrix of residual errors and a Jacobian matrix of the residual errors are calculated, wherein the observed value is used for calculating the residual errors, the Jacobian matrix of the residual errors is a gradient direction for optimizing data fusion, and the covariance matrix is a weight value corresponding to the observed value and is used for fusing the inertia measurement data and visual data.
In the IMU integration model, the inertial measurements are obtained directly in the body coordinate system, so the measured values contain gravity information together with the actual inertial motion of the head-mounted augmented reality system, and are affected by the acceleration bias b_a, the gyroscope bias b_ω and additive noise.
Combining the image frames bk and bk+1, which correspond to two time instants, the position, velocity and orientation of the head-mounted augmented reality system in the world coordinate system over the time interval [tk, tk+1] are computed from the inertial measurements.
After changing the reference coordinate system from the world coordinate system to the body coordinate system, the IMU pre-integration algorithm pre-integrates only the variables related to the linear acceleration â and the angular velocity ω̂ expressed in the body frame.
In the visual and inertial data fusion process, for two consecutive frame images bk and bk+1 in the sliding window, the residual of the pre-integrated inertial measurements is computed according to the IMU measurement model from the noisy accelerometer and gyroscope measurements in the time interval between the two consecutive frame images.
In the visual and inertial data fusion process, a marginalization method is used to reduce the computational complexity of the nonlinear-optimization-based VIO: selected inertial state quantities x_k and feature quantities λ_l are marginalized out of the sliding window, while the quantities corresponding to the marginalized states are converted into prior data, so that spatially separated key frames remain in the sliding window and provide sufficient parallax for feature triangulation. When the next new frame image is a key frame image, it is retained in the sliding window and the measurements corresponding to the earliest frame image are marginalized. If the next new frame image is a non-key frame image, only its inertial measurements are retained and its visual measurements are discarded, the inertial measurements remaining associated with this non-key frame. The marginalization is computed with the Schur complement, and a new prior is constructed from all marginalized measurements related to the removed states and added to the existing prior.
In the relocation process, loop detection is realized with a bag-of-words model. The bag-of-words model is first built from feature point data actually collected in the space operating environment; when the position of the vision sensor changes, a new feature point set is computed in real time, this feature point set is formed into a new description vector and matched against the key-frame feature description vectors, and if the match succeeds a loop is detected. The bag-of-words model is used to recognize previously visited places, a feature-level connection is then established between the loop-closure candidate and the current frame image, and the feature pairs are integrated into the nonlinear-optimization-based visual-inertial odometry (VIO) to obtain a drift-free target state estimate. By matching the feature point set obtained at a certain position against the bag of words, the features at that position are described as
A = a1 w1 + a2 w2 + a3 w3 + ... + an wn,
where wi denotes the i-th feature (word) vector in the bag of words and ai denotes the frequency with which that feature occurs; [a1, a2, ..., an] is the description vector. The L1 norm is used to compare the distance between description vectors, i.e.
d(a, b) = Σ_{i=1}^{W} | a_i − b_i |,
where a and b are the description vectors being compared and W is the dimension of the description vectors.
The bag-of-words model is built with a clustering method, and the bag of words is represented as a k-ary tree. All features collected in a preliminary test provide the feature point data, from which a number of feature point descriptors are randomly selected as clustering samples. The clustering is implemented with the K-Means algorithm: the number of clusters K is fixed, and K D-dimensional feature descriptors are randomly selected as initial cluster centers, where D is the dimension of the feature descriptors; the distance from each feature descriptor to every cluster center is computed and the nearest cluster center is chosen as that descriptor's class; within each class the cluster center is then recomputed, and the assignment and update steps are iterated until the sample data converge. Loop detection is implemented with the TF-IDF algorithm.
When a loop is detected, the BRIEF feature descriptors are matched, feature consistency is checked, and a connection between the local sliding window and the loop candidate is established. Outliers are removed with geometric outlier rejection: a fundamental matrix test is performed with the RANSAC algorithm using the 2D features retrieved in the current image and in the candidate loop image, and PnP position verification is performed with the RANSAC algorithm using the known 3D positions of the features in the local sliding window and their 2D projections in the candidate loop image.
In the global pose graph optimization, after relocation the local sliding window is shifted and aligned with past poses. Using the relocation result, registration of the set of past poses into a globally consistent configuration is ensured by global pose graph optimization. When a key frame is marginalized out of the sliding window, it is added to the pose graph as a new vertex, and this new vertex is connected to the other vertices in the pose graph by sequential edges and/or loop edges: the newly marginalized key frame establishes several sequential edges associating it with existing key frames, where a sequential edge represents the relative transformation between two key frames in the local sliding window and its value is taken directly from the VIO.
If the newly marginalized key frame has a loop connection with an existing key frame, a loop edge in the pose graph connects it to the loop frame; the loop edge contains only a 4-degree-of-freedom relative pose transformation, and its value is obtained from the relocation result.
Example two:
the invention discloses a visual and inertial mixed pose tracking method of a head-mounted augmented reality system, which comprises the steps of firstly carrying out equipment initialization, including measurement preprocessing on a visual sensor and an inertial measurement device, carrying out combined calibration on the visual sensor and the inertial measurement device, then carrying out fusion positioning on the head-mounted augmented reality system by using measurement data acquired by the visual sensor and the inertial measurement device, and carrying out loop detection and global pose graph optimization on a positioning result to obtain a final pose estimation result.
The measurement preprocessing process comprises processing frame images acquired by a vision sensor and performing pre-integration on inertial measurement results corresponding to two continuous frame images;
the method comprises the steps of carrying out combined calibration on a vision sensor and an inertia measurement device, calibrating the inertia measurement device to obtain an initial estimation result of a target attitude, a speed, a gravity vector, a gyroscope bias and a three-dimensional characteristic position, guiding a Visual-Inertial odometer VIO (VIO) based on nonlinear optimization, and calculating an absolute scale factor, the gyroscope bias, an acceleration bias, a gravity acceleration and a relative speed of the inertia measurement device to the vision sensor at each measurement moment.
The fusion positioning process fuses the pre-integrated inertial measurements, the visual image feature observations and the loop detection results to realize fusion positioning of the head-mounted augmented reality system.
The global pose graph optimization performs global optimization using the relocation result in order to eliminate drift.
Monocular visual-inertial odometry is a highly nonlinear system, and scale information is difficult to observe directly with a monocular camera, so good initial values are required before the inertial and visual measurements can be fused directly. At the same time, the inertial system suffers from bias, requires the gravitational acceleration when solving for pose, and requires accurate velocity information during rapid motion, so the initialization process must compute the absolute scale factor between the IMU and the vision measurements, the gyroscope bias, the acceleration bias, the gravitational acceleration and the velocity at each IMU measurement time. Because in practical applications the system must be initialized while in motion, an online calibration scheme is needed to ensure stable operation in a variety of environments.
Visual-only SLAM, or Structure from Motion (SfM), usually has good initialization properties, i.e. the system is bootstrapped by deriving initial values from relative motion.
In the device initialization process, the visual SFM algorithm based on the sliding window is initialized, the pose change of the head-mounted augmented reality system is calculated, the IMU pre-integration result is matched with the visual SFM algorithm result, and the scale, gravity, speed and deviation are restored to guide the VIO based on the nonlinear optimization, as shown in FIG. 2.
In the initialization of the sliding-window-based visual SFM algorithm, visual SfM can automatically recover the camera motion and the scene structure from several images of the scene; the sliding-window-based visual SfM initialization process estimates the motion with visual SfM while using the sliding-window technique to control the computational complexity.
The initialization of the sliding-window-based visual SFM algorithm includes feature point tracking, camera pose estimation and optimization. For feature point tracking, the feature correspondences between the latest frame image acquired by the head-mounted augmented reality system and all frame images in the sliding window are first checked. If the latest frame image and any frame image in the sliding window share more than a certain number of tracked features (more than 30 tracked features) and enough rotation-compensated parallax (more than 20 pixels), the five-point algorithm is used to compute the relative rotation and scaled translation of the latest frame image from those tracked features, and all features in the two frames are triangulated; otherwise the latest frame image is stored in the sliding window and a new frame image is awaited. For camera pose estimation, the poses of all frame images in the sliding window are estimated from the triangulated features with the PnP method, giving the camera poses. The optimization is realized by minimizing the total reprojection error of all feature observations with a BA optimization method, completing the optimized estimate of the camera attitude. Feature point tracking has been described above; the camera pose estimation is solved with an epipolar geometry approach, and the optimization is set forth in the global pose graph optimization.
FIG. 3 is a schematic view of the epipolar geometric constraint. P is the position of the real-space object point corresponding to the feature points of the left and right frames acquired by the left-eye and right-eye cameras, and is called the feature position; p1 and p2 are the projections of the object point onto the left-eye and right-eye cameras, and O1 and O2 are the optical centers of the left-eye and right-eye cameras. Since the relative placement of the two cameras of the binocular rig is accurately measured and known, the depth of P can be estimated from p1 and p2. Writing P = [X, Y, Z]^T and using the pinhole camera model, the feature points p1 and p2 satisfy:

s1 p1 = K1 P

s2 p2 = K2 (R P + t),

where K1 and K2 are the intrinsic matrices of the left-eye and right-eye cameras, R and t are the rigid-body rotation matrix and translation (offset) between the two cameras, and s1, s2 are the projective scale factors of the left-eye and right-eye cameras. Stacking these projection constraints gives an overdetermined linear system in P whose least-squares solution yields the feature position estimate. From the feature point matching information and the feature position estimates, the SfM algorithm provides the three-dimensional positions of the current feature points and their projections onto the camera, and the image pose is then estimated directly with the Perspective-n-Point (PnP) algorithm.
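As an illustrative sketch only (not taken from the patent), the feature-position estimation and the subsequent PnP pose estimation described above can be prototyped with OpenCV; the intrinsics K1, K2, the relative pose R, t and the matched pixel arrays are assumed to be available from the calibration and tracking steps.

import cv2
import numpy as np

def estimate_feature_and_pose(K1, K2, R, t, pts1, pts2):
    """Triangulate matched features of two views and estimate the pose of the
    second view with PnP. pts1, pts2: Nx2 pixel coordinates of matched points."""
    # Projection matrices built from the known intrinsics and relative pose
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K2 @ np.hstack([R, t.reshape(3, 1)])

    # Linear triangulation (least-squares solution for each feature position P)
    X_h = cv2.triangulatePoints(P1, P2, pts1.T.astype(float), pts2.T.astype(float))
    X = (X_h[:3] / X_h[3]).T                     # Nx3 feature positions

    # PnP with RANSAC: 3D feature positions vs. their 2D observations in view 2
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        X.astype(np.float32), pts2.astype(np.float32), K2, None)
    R2, _ = cv2.Rodrigues(rvec)                  # rotation matrix of view 2
    return X, R2, tvec, inliers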
Assume a feature point P with known spatial position is observed at a certain pose, with homogeneous coordinates [X, Y, Z, 1]^T, and that its projection onto the left eye of the binocular vision sensor is x1 = [u1, v1, 1]^T. Define the augmented matrix T = [R | t], a 3x4 matrix containing the rotation and translation information, so that the projection equation is:

s x1 = K [R | t] P

where K is the known camera intrinsic matrix. Since K is known, normalized image-plane coordinates can be used to remove the influence of the intrinsic matrix K. Denoting the three rows of T by t1^T, t2^T and t3^T, the projection equation expands to

s [u1, v1, 1]^T = [t1^T; t2^T; t3^T] P.

Eliminating the scale factor s with the last row yields two constraints:

u1 = (t1^T P) / (t3^T P),  v1 = (t2^T P) / (t3^T P),

that is:

t1^T P - (t3^T P) u1 = 0
t2^T P - (t3^T P) v1 = 0.
Because T contains 12 unknowns, at least 6 pairs of 3D-2D correspondences are required. To estimate the camera pose more accurately, 18 points are selected at random from the currently measured feature points; if fewer than 18 are available, all of them are used, and if fewer than 6 are available the pose cannot be computed and is discarded. When more than 6 points are available, the least-squares solution is obtained by SVD (singular value decomposition). The two constraints of each of the 18 3D-2D correspondences are stacked into the linear system

A t = 0,

where t = [t1, t2, ..., t12]^T collects the 12 unknown entries of T and A is the coefficient matrix formed by stacking two rows per correspondence (36x12 for 18 points). Taking the singular value decomposition

A = U S V^T,

the right singular vector in V corresponding to the smallest singular value in the singular value matrix S is the least-squares solution, i.e., the least-squares solution of the camera pose estimate.
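The DLT solution above can be written as a short numpy sketch (illustrative only, not the patent's code): it stacks the two constraints per 3D-2D correspondence, solves A t = 0 by SVD, and reshapes the result into the 3x4 matrix T = [R | t]; the final scale/sign normalization is an approximation of the usual post-processing.

import numpy as np

def pnp_dlt(points_3d, points_2d_norm):
    """Direct Linear Transform PnP.
    points_3d:      Nx3 points in the world frame (N >= 6).
    points_2d_norm: Nx2 normalized image coordinates (intrinsics removed)."""
    A = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d_norm):
        P = [X, Y, Z, 1.0]
        # t1^T P - (t3^T P) u = 0   and   t2^T P - (t3^T P) v = 0
        A.append(P + [0, 0, 0, 0] + [-u * p for p in P])
        A.append([0, 0, 0, 0] + P + [-v * p for p in P])
    A = np.asarray(A)                      # 2N x 12 coefficient matrix
    _, _, Vt = np.linalg.svd(A)
    T = Vt[-1].reshape(3, 4)               # singular vector of the smallest singular value
    # Normalize so the left 3x3 block is close to a proper rotation (approximate)
    U, S, Vt3 = np.linalg.svd(T[:, :3])
    T = T / np.mean(S) * np.sign(np.linalg.det(U @ Vt3))
    return T                               # 3x4 pose [R | t], up to DLT approximation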
The IMU calibration produces estimates of the gyroscope bias, the velocity, the gravitational acceleration and the scale, and refines the gravity vector, providing accurate initial parameters for the joint visual-inertial calibration.
The calibration of the inertial measurement device comprises gyroscope bias calibration, gravity vector refinement, and the calibration of velocity, gravity vector and scale. For the gyroscope bias, consider two consecutive frames b_k and b_{k+1} in the image window: the visual SfM algorithm provides their rotations q^{c0}_{b_k} and q^{c0}_{b_{k+1}} with respect to the reference frame c0, and the pre-integration of the corresponding inertial measurements provides the relative rotation γ̂^{b_k}_{b_{k+1}} from frame b_{k+1} back to frame b_k. Linearizing the pre-integration term with respect to the gyroscope bias of the inertial measurement device, the following minimization loss function is constructed:

min_{δb_ω} Σ_{k∈B} || (q^{c0}_{b_{k+1}})^{-1} ⊗ q^{c0}_{b_k} ⊗ γ^{b_k}_{b_{k+1}} ||²,

γ^{b_k}_{b_{k+1}} ≈ γ̂^{b_k}_{b_{k+1}} ⊗ [1, ½ J^γ_{b_ω} δb_ω]^T,

where B is the set of all frame indices in the image window and J^γ_{b_ω} is the bias Jacobian of the pre-integrated rotation, used as the calibration coefficient of the first-order approximation with respect to the gyroscope bias. Solving this minimization loss function yields the initial calibration of the gyroscope bias b_ω.
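A compact sketch of this gyroscope-bias calibration step (an assumption-laden illustration, not the patent's code): q_c0_b holds the visual SfM rotations of the window frames, gamma_hat the pre-integrated relative rotations, and J_gamma_bw the Jacobians of the pre-integrated rotation with respect to the gyroscope bias; quaternion sign and composition conventions may differ in a real implementation.

import numpy as np
from scipy.spatial.transform import Rotation as R

def calibrate_gyro_bias(q_c0_b, gamma_hat, J_gamma_bw):
    """Solve the linearized least-squares problem for the gyroscope bias correction.
    q_c0_b:     list of scipy Rotations q^{c0}_{b_k}, k = 0..n  (from visual SfM)
    gamma_hat:  list of scipy Rotations, pre-integrated rotation b_k -> b_{k+1}
    J_gamma_bw: list of 3x3 Jacobians of gamma_hat w.r.t. the gyroscope bias."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for k in range(len(gamma_hat)):
        # residual rotation that should be identity when the bias is correct
        q_err = gamma_hat[k].inv() * q_c0_b[k].inv() * q_c0_b[k + 1]
        r = 2.0 * q_err.as_quat()[:3]          # vector part of the error quaternion
        J = J_gamma_bw[k]
        A += J.T @ J
        b += J.T @ r
    delta_bw = np.linalg.solve(A, b)           # correction added to the gyroscope bias
    return delta_bw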
After the gyroscope bias has been initialized, the velocity, the gravity vector and the actual scale are computed next. The quantities to be estimated are collected in the state vector

X_I = [v^{b_0}_{b_0}, v^{b_1}_{b_1}, ..., v^{b_n}_{b_n}, g^{c0}, s]^T,

where v^{b_k}_{b_k} is the velocity of the target in the body coordinate system at the time the k-th frame is acquired, g^{c0} is the gravity vector expressed in the frame of the c0-th image, and s is the scale factor that aligns the visual SfM result with metric units. Consider two consecutive frames b_k and b_{k+1} in the image window: their pre-integration terms α̂^{b_k}_{b_{k+1}} and β̂^{b_k}_{b_{k+1}} can be written as linear functions of the states in X_I, which are combined into the linear measurement model

ẑ^{b_k}_{b_{k+1}} = H^{b_k}_{b_{k+1}} X_I + n^{b_k}_{b_{k+1}},

where the rotations R^{c0}_{b_k} entering H^{b_k}_{b_{k+1}} are obtained from the scale-matched monocular visual SfM and Δt_k is the time interval between the two consecutive frames. By solving the linear least-squares problem

min_{X_I} Σ_{k∈B} || ẑ^{b_k}_{b_{k+1}} - H^{b_k}_{b_{k+1}} X_I ||²,

the velocity of each frame in the body coordinate system, the gravity vector in the visual reference frame and the scale are obtained from the sliding window.
For the gravity vector refinement, since the magnitude of the gravity vector is known, the gravity vector obtained by the linear initialization step can be refined by constraining its norm. With the magnitude fixed, the gravity vector retains only 2 degrees of freedom and can therefore be re-parameterized with two variables in its tangent space.

The gravity vector refinement first writes the gravity vector as

g^{c0} ≈ g · ḡ̂ + w1 b1 + w2 b2,

where g is the known magnitude of gravity, ḡ̂ is the unit vector giving the direction of gravity, and b1, b2 are two orthogonal bases spanning the tangent plane at ḡ̂, as shown in FIG. 4, with w1 and w2 the displacements along b1 and b2, respectively. A set of b1, b2 can be found through the cross-product construction of Algorithm 1. This parameterized expression is then substituted for g in the linear formulation, and w1 and w2 are solved together with the other state variables; the process is iterated until ĝ converges.

Since the magnitude of the earth's gravity is known, g lies on a sphere of radius g ≈ 9.81 m/s². Gravity is therefore parameterized as g · ḡ̂ + w1 b1 + w2 b2, where b1 and b2 are the two orthogonal bases of the spanned tangent plane.

After the gravity vector is refined, the rotation q^{w}_{c0} between the world coordinate system and the visual sensor coordinate system c0 is obtained by rotating the gravity onto the z axis. All variables are then rotated from the reference frame (·)^{c0} to the world frame (·)^{w}, and the velocities in the body coordinate system are likewise rotated to the world frame.
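The tangent-space parameterization of gravity and the final alignment of the c0 frame with the world frame can be sketched as follows (illustrative assumptions: g_c0 is the linearly initialized gravity vector, the gravity magnitude is 9.81 m/s², and the world vertical axis is taken as +z, which is a sign convention chosen here).

import numpy as np

G_NORM = 9.81  # known magnitude of gravity, m/s^2

def tangent_basis(g_dir):
    """Two orthonormal vectors b1, b2 spanning the tangent plane at g_dir (unit vector)."""
    tmp = np.array([1.0, 0.0, 0.0])
    if abs(np.dot(g_dir, tmp)) > 0.9:          # avoid a nearly parallel helper axis
        tmp = np.array([0.0, 0.0, 1.0])
    b1 = np.cross(g_dir, tmp); b1 /= np.linalg.norm(b1)
    b2 = np.cross(g_dir, b1)
    return b1, b2

def refine_gravity_step(g_c0, w):
    """One update: add the tangent-plane correction, then re-normalize onto the
    sphere of radius G_NORM; iterate with re-estimated (w1, w2) until convergence."""
    g_dir = g_c0 / np.linalg.norm(g_c0)
    b1, b2 = tangent_basis(g_dir)
    g_new = G_NORM * g_dir + w[0] * b1 + w[1] * b2
    return G_NORM * g_new / np.linalg.norm(g_new)

def rotation_c0_to_world(g_c0):
    """Rotation mapping the refined gravity direction onto the world z axis (Rodrigues)."""
    g_dir = g_c0 / np.linalg.norm(g_c0)
    z = np.array([0.0, 0.0, 1.0])
    axis = np.cross(g_dir, z)
    s, c = np.linalg.norm(axis), np.dot(g_dir, z)
    if s < 1e-8:
        return np.eye(3)
    axis /= s
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    theta = np.arctan2(s, c)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * K @ K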
The calibration of the vision sensor and the inertial measurement device determines the 6-DoF information, i.e. the relative attitude angle and position, between the vision sensor and the inertial measurement device after the coordinate systems of the two have been established. Because the raw values in the inertial measurement coordinate system are obtained directly from the inertial measurement device, they must be converted into the data required by the vision sensor during hybrid visual-inertial tracking, and this conversion depends on the relative 6-DoF relationship between the two coordinate systems.
In the joint calibration of the vision sensor and the inertial measurement device, C_obj denotes the object coordinate system, which is taken as the world coordinate system; C_c denotes the vision sensor coordinate system; C_i denotes the inertial measurement device coordinate system. As shown in FIG. 5, the relative pose between C_c and C_obj can be obtained by a vision sensor calibration method; it is denoted here by R_a and t_a, the attitude and the three translational components of the vision sensor with respect to the world coordinate system. Through the relative pose R_x, t_x between C_i and C_c, the relative pose between C_i and C_obj is obtained. Because the vision sensor and the inertial measurement device are rigidly fixed relative to each other, R_x and t_x are constant. The calibration of the vision sensor and the inertial measurement device is carried out by observing a known calibration reference in space from different positions and deriving R_x and t_x from the relations between the repeated observations. FIG. 5 shows the relative positions of the coordinate systems when moving from position P_a to position P_b.
Let the 4x4 matrices A, B, C, D and X each describe the relative orientation between two coordinate systems; a relative orientation with rotation matrix R_a and translation vector t_a is written as

[ R_a  t_a ; 0  1 ],

where the subscript a denotes the name of the matrix. In the figure, C_obj is the coordinate system of the calibration reference (i.e., the world coordinate system referred to earlier), C_c1 and C_i1 are the vision sensor and inertial measurement coordinate systems at time t1, and C_c2 and C_i2 are these two coordinate systems at time t2. The extrinsic parameters of C_c1 and C_c2 relative to C_obj can be determined by a vision sensor calibration method and are denoted A and B, respectively. If C denotes the relative orientation between C_c1 and C_c2 (i.e., the rotation and translation of the vision sensor as it moves from C_c1 to C_c2), then C = A B^{-1}. The relative orientation between C_i1 and C_i2 can be obtained directly from the inertial measurement device and is denoted D (i.e., the rotation and translation of the inertial measurement device as it moves from C_i1 to C_i2). The relative orientation between C_c1 and C_i1 is X; the relative orientation between C_c2 and C_i2 is also X, so X is a fixed value. For an arbitrary point P in space, with coordinates p_c1, p_c2, p_i1, p_i2 in the four coordinate systems C_c1, C_c2, C_i1, C_i2, the following relations hold:

p_c1 = C p_c2
p_c1 = X p_i1
p_i1 = D p_i2
p_c2 = X p_i2

from which:

p_c1 = C X p_i2
p_c1 = X D p_i2,

and hence C X = X D, the basic equation for calibrating the vision sensor and the inertial measurement device, where the relative pose X between the vision sensor and the inertial measurement device is unchanged before and after the motion.
Writing each 4x4 matrix in C X = X D with its rotation matrix and translation vector gives

[ R_c  t_c ; 0  1 ] [ R_x  t_x ; 0  1 ] = [ R_x  t_x ; 0  1 ] [ R_d  t_d ; 0  1 ],

which expands to:

R_c R_x = R_x R_d
R_c t_x + t_c = R_x t_d + t_x.

In these equations R_c, R_d, t_c and t_d are known, and R_c, R_x, R_d are all unit orthogonal (rotation) matrices; the quantities to be solved are R_x and t_x.

From the equation R_c R_x = R_x R_d one obtains R_d = R_x^{-1} R_c R_x. Here R_x is a unit orthogonal matrix defining a motion of any point in space; such a motion can be regarded as a rotation by an angle θ about some axis k, and k and θ are uniquely determined by the R matrix, so the R_x matrix can be written as R_x(k, θ), where k is the unit vector of the rotation axis and the sign of θ follows the right-hand rule. If R_x = R_x(k, θ), then the eigenvalues of R_x are 1, e^{jθ} and e^{-jθ}, and the eigenvector of R_x corresponding to the eigenvalue 1 is k (i.e., the first column of the eigenvector matrix U is k). It follows that R_d and R_c are similar matrices and share the same eigenvalues. Since the eigenvalues of a rotation matrix are uniquely determined by its rotation angle, the rotation angles defined by R_d and R_c must be equal; denoting this rotation angle by θ, the equation can be written as:

R_c(k_c, θ) R_x = R_x R_d(k_d, θ),

where k_c and k_d are the rotation axes defined by R_c and R_d. Note that in practice R_c and R_d are known data; the requirement that they define equal rotation angles is a condition on the equation itself and is independent of the R_x to be determined.
If R_c(k_c, θ) R_x = R_x R_d(k_d, θ), then k_c = R_x k_d; and if R_x* is a particular solution, then every solution R_x of the above equation can be expressed as

R_x = R(k_c, β) R_x*,

where k_c is the rotation axis of R_c and β is an arbitrary angle. R_x therefore cannot be solved uniquely from this equation; one degree of freedom β remains to be determined.

The equation R_c t_x + t_c = R_x t_d + t_x can be written as:

(R_c - I) t_x = R_x t_d - t_c,

where R_c, t_c and t_d are known, I is the identity matrix, and t_x is the parameter to be solved. From the properties of the rotation matrix described above, the eigenvalues of R_c - I are 0, e^{jθ} - 1 and e^{-jθ} - 1. It can be seen that one of the eigenvalues of R_c - I is 0; when θ ≠ 0 (i.e., R_c is not an identity matrix), R_c - I is a matrix of rank 2. Thus, even if R_x is known, this equation provides only two independent linear equations in the three components of t_x, so t_x cannot be uniquely determined from it either; one degree of freedom remains to be determined.
If two motions are performed during the calibration (the coordinate system positions before and after the two motions are shown in FIG. 6), the following four relations are obtained:

R_c1 R_x = R_x R_d1,  R_c1 t_x + t_c1 = R_x t_d1 + t_x,  R_c2 R_x = R_x R_d2,  R_c2 t_x + t_c2 = R_x t_d2 + t_x,

where R_c1, t_c1, R_c2, t_c2, the parameters of the two motions, are given by the extrinsic parameters obtained from three vision sensor calibrations, and R_d1, t_d1, R_d2, t_d2 are given by the inertial measurement device for the two motions. R_x is determined jointly from these equations and is then substituted back to solve for t_x. Any solution R_x must satisfy the following two equations simultaneously:

k_c1 = R_x k_d1
k_c2 = R_x k_d2,

where k_c1, k_c2, k_d1, k_d2 are the unit vectors along the rotation axes determined by R_c1, R_c2, R_d1, R_d2. Since R_x maps k_d1 to k_c1 and k_d2 to k_c2, it must also map k_d1 × k_d2 to k_c1 × k_c2. Putting these relations into one matrix equation gives

(k_c1  k_c2  k_c1 × k_c2) = R_x (k_d1  k_d2  k_d1 × k_d2).

When k_c1 and k_c2 are not parallel to each other, the matrix above is of full rank, and then

R_x = (k_c1  k_c2  k_c1 × k_c2)(k_d1  k_d2  k_d1 × k_d2)^{-1}.

Substituting the R_x solved above into the translation equations yields four linear equations in t_x; any three independent ones of them can be solved for t_x, or t_x is obtained from all four equations by least squares. It should be noted in particular that as long as the rotation axes of the two motions R_d1 and R_d2 are not parallel to each other, the solved R_x is unique, and when R_c1 and R_c2 are not identity matrices (i.e., the motions are not pure translations), t_x is also uniquely determined.
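A numpy sketch of this two-motion solution for R_x and t_x (illustrative; the rotation-axis extraction and least-squares step follow the equations above, the axis signs are assumed to be consistent between the camera and IMU motions, and a production implementation would disambiguate them).

import numpy as np

def rotation_axis(Rm):
    """Unit rotation axis k of a rotation matrix (eigenvector for eigenvalue 1)."""
    w, v = np.linalg.eig(Rm)
    k = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return k / np.linalg.norm(k)

def hand_eye_two_motions(Rc1, tc1, Rc2, tc2, Rd1, td1, Rd2, td2):
    """Solve Rc_i Rx = Rx Rd_i and Rc_i tx + tc_i = Rx td_i + tx for two motions."""
    kc1, kc2 = rotation_axis(Rc1), rotation_axis(Rc2)
    kd1, kd2 = rotation_axis(Rd1), rotation_axis(Rd2)
    Mc = np.column_stack([kc1, kc2, np.cross(kc1, kc2)])
    Md = np.column_stack([kd1, kd2, np.cross(kd1, kd2)])
    Rx = Mc @ np.linalg.inv(Md)            # full rank when kc1 and kc2 are not parallel
    # stack (Rc_i - I) tx = Rx td_i - tc_i and solve by least squares
    A = np.vstack([Rc1 - np.eye(3), Rc2 - np.eye(3)])
    b = np.hstack([Rx @ td1 - tc1, Rx @ td2 - tc2])
    tx, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Rx, tx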
In principle, data from three poses of the vision sensor and the inertial measurement device suffice to complete the calibration between them. To reduce the calibration error, data from multiple measurements are used and expressed in quaternion form. The dot product of unit quaternions is a similarity measure between two quaternions q1 and q2, and the optimal rotation is therefore the one that maximizes the sum of these similarity measures over all measurements. To obtain a rough estimate of this optimum, the problem is recombined in the following linear form. The n measured quaternions are collected into the 4xn matrices

Q_c = (q_c1, ..., q_cn)
Q_d = (q_d1, ..., q_dn),

and, letting S denote the 4x4 matrix representation of multiplication by the unknown quaternion q_x, the measurement relation can be expressed as:

Q_d ≡ S Q_c.

Solving this relation yields the rotation matrix R_x, and substituting R_x back into the translation equations then yields t_x.
In the joint calibration of the vision sensor and the inertial measurement device, multi-dimensional parameter information must be obtained from the inertial measurement device. The procedure is as follows: first, the 3D information is obtained directly, i.e., a set of target attitude angles expressed as three Euler angles together with the raw angular velocity and linear acceleration are obtained directly from the inertial measurement device, giving the target attitude at the required time t_x; then, assuming the target initial position is zero, i.e., the initial position s_0 = 0, the target position s_1 after a time Δt at initial velocity v is estimated by the linear formula s_1 = v Δt + ½ a Δt², where a is the target acceleration. The multi-dimensional parameter information comprises the target attitude angles, the raw angular velocity, the linear acceleration, the target attitude time and the target position.
The fusion positioning of the head-mounted augmented reality system using the measurement data acquired by the vision sensor and the inertial measurement device comprises measurement preprocessing, visual and inertial data fusion, relocation, and global pose graph optimization.
The measurement preprocessing comprises visual front-end processing and IMU data pre-integration.
In the visual front-end processing, the feature points of each frame are tracked with the LK sparse optical flow algorithm while the corner feature points of each frame are detected, keeping 300 features per frame; by enforcing a minimum pixel distance between two neighbouring features, the features are uniformly distributed over each frame.
Corner feature points are tracked as local features because they are highly distinctive within a local neighbourhood. Corner detection uses the second-derivative matrix of the image grey-level intensity, i.e. the two-dimensional Hessian matrix

H(p) = [ ∂²I/∂x²  ∂²I/∂x∂y ; ∂²I/∂x∂y  ∂²I/∂y² ],

where p denotes a pixel in the image and I the grey-level intensity at pixel p. The second derivatives of the grey-level intensity are computed for all pixels of a frame to obtain the second-derivative images, and the autocorrelation matrix of these derivative images is computed over a fixed window around each pixel:

M(x, y) = Σ_{-K ≤ i,j ≤ K} w_{i,j} [ I_x²(x+i, y+j)  I_x I_y(x+i, y+j) ; I_x I_y(x+i, y+j)  I_y²(x+i, y+j) ],

where w_{i,j} is the normalized weight of the pixel in row i and column j of the fixed window (producing a circular window with Gaussian weighting), K is the number of pixels per row or column of the fixed window, and I_x, I_y are the image derivatives used to compute the correlation at the pixels of the fixed window around (x, y). The eigenvalues of the autocorrelation matrix are computed at each pixel, and if the smaller eigenvalue still exceeds the set minimum threshold, the pixel is taken as a corner feature point.
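In practice, the minimum-eigenvalue corner criterion with an enforced feature spacing is available directly in OpenCV; the sketch below is illustrative, with the feature count (300) and the spacing value taken as assumptions consistent with the front-end description above.

import cv2

def detect_corners(gray, max_corners=300, min_distance=30):
    """Minimum-eigenvalue (Shi-Tomasi style) corner detection with enforced spacing,
    so that features are uniformly distributed over the image."""
    corners = cv2.goodFeaturesToTrack(
        gray,
        maxCorners=max_corners,        # keep about 300 features per frame
        qualityLevel=0.01,             # relative threshold on the smallest eigenvalue
        minDistance=min_distance,      # minimum pixel distance between features
        useHarrisDetector=False)       # use the minimum-eigenvalue criterion
    return corners.reshape(-1, 2) if corners is not None else None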
The invention tracks feature points with the LK optical flow method. Optical flow describes the motion of pixels in the image; in LK optical flow the image delivered by the camera varies over time and can therefore be regarded as a function of time, I(t). That is, at time t the pixel at (x, y) has grey level I(x, y, t); the image is treated as a function of position and time whose value is the grey level of the pixel. For a fixed point with pixel coordinates (x, y) at time t, the camera motion changes the image, and to estimate the motion of the pixel the invention assumes that the grey value of the same spatial point is constant across images, that the pixels within a certain window share the same motion, and that the camera motion is slow.
For the pixel located at (x, y) at time t, it moves to (x + dx, y + dy) at time t + dt while its grey level remains unchanged, i.e.

I(x + dx, y + dy, t + dt) = I(x, y, t).

A Taylor expansion of the left-hand side at I(x, y, t) gives

I(x + dx, y + dy, t + dt) ≈ I(x, y, t) + I'_x dx + I'_y dy + I'_t dt + n,

where n is a higher-order infinitesimal; neglecting it yields

I'_x dx + I'_y dy + I'_t dt = 0.

Dividing both sides by dt gives:

I'_x (dx/dt) + I'_y (dy/dt) + I'_t = 0,

where dx/dt is the motion speed of the pixel along the x axis and dy/dt the motion speed along the y axis, denoted u and v respectively; I'_x is the gradient of the image in the x direction at that point, I'_y the gradient in the y direction, and I'_t the temporal change of the image grey level. In matrix form:

[I'_x  I'_y] [u; v] = -I'_t.

Consider a window of size w × w containing w² pixels. Since the pixels within the window share the same motion, w² such equations are obtained:

[I'_x  I'_y]_k [u; v] = -I'_{t,k},  k = 1, ..., w²,

which are written as the matrix equation

A [u; v] = -b,  with  A = [ [I'_x, I'_y]_1 ; ... ; [I'_x, I'_y]_{w²} ],  b = [ I'_{t,1} ; ... ; I'_{t,w²} ].

Solving this over-determined system in the least-squares sense gives the estimate of the pixel motion:

[u; v]* = -(AᵀA)⁻¹ Aᵀ b.
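The windowed least-squares solution derived above is what pyramidal LK implementations compute internally; a minimal OpenCV sketch follows (illustrative window size and pyramid depth).

import cv2
import numpy as np

def track_features_lk(prev_gray, cur_gray, prev_pts):
    """Track feature points from the previous frame with pyramidal LK optical flow."""
    p0 = prev_pts.astype(np.float32).reshape(-1, 1, 2)
    p1, status, err = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, p0, None,
        winSize=(21, 21), maxLevel=3)        # w x w window and pyramid levels
    good = status.ravel() == 1               # keep only successfully tracked points
    return p0.reshape(-1, 2)[good], p1.reshape(-1, 2)[good]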
After tracking the feature points of each frame with the LK sparse optical flow algorithm, key frames are selected by an average-parallax criterion or a tracking-quality criterion. In the average-parallax criterion, if the average parallax of the tracked features between the current frame and the latest key frame exceeds the set threshold (10), the current frame is taken as a new key frame; when computing the average parallax, the image rotation is compensated with the short-term integral of the inertial measurement results. Both translation and rotation produce parallax, but features under purely rotational motion cannot be triangulated and cannot be used to estimate relative depth, so the short-term integral of the gyroscope measurements is used to compensate the rotation when computing the parallax. Because this rotation compensation is used only for key-frame selection, even a gyroscope with large noise or bias can at worst cause an unreasonable key-frame selection and does not directly degrade the quality of the pose estimation. In the tracking-quality criterion, if the number of tracked features falls below a set value, the current frame is taken as a new key frame to avoid losing the feature tracks entirely; the set value may be 100.
The IMU data pre-integration uses the IMU integration model and an IMU pre-integration algorithm to compute the observed value of the inertial measurement data, the covariance matrix of the residual and the Jacobian matrix of the residual: the observed value is used to compute the residual, the Jacobian matrix of the residual gives the gradient direction for the optimization-based data fusion, and the covariance matrix provides the weight of the observed value used when fusing the inertial and visual data.
The IMU output rate is 200 Hz; such fast and voluminous data would rapidly inflate the number of optimization variables and destroy the real-time performance of the optimization. The pre-integration method turns the IMU samples between two image frames into a constant term, greatly reducing the number of optimization variables, while the covariance propagation equation is derived from the continuous-time IMU error-state dynamics.
In the IMU integration model, the raw gyroscope and accelerometer measurements of the IMU, ω̂_t and â_t, are expressed as follows:

â_t = a_t + b_{a_t} + R^t_w g^w + n_a
ω̂_t = ω_t + b_{ω_t} + n_ω.

The directly obtained inertial measurements are expressed in the body coordinate system, so they contain the gravity information as well as the actual inertial motion of the head-mounted augmented reality system, and they are affected by the accelerometer bias b_a, the gyroscope bias b_ω and additive noise. The accelerometer and gyroscope measurement noise is assumed to be Gaussian, i.e.:

n_a ~ N(0, σ_a²),  n_ω ~ N(0, σ_ω²).

The accelerometer and gyroscope bias terms are both modeled as random walks, i.e.:

ḃ_{a_t} = n_{b_a},  ḃ_{ω_t} = n_{b_ω},

whose derivatives follow Gaussian distributions:

n_{b_a} ~ N(0, σ_{b_a}²),  n_{b_ω} ~ N(0, σ_{b_ω}²).

For the image frames b_k and b_{k+1}, corresponding to the two time instants t_k and t_{k+1}, the position, velocity and orientation of the head-mounted augmented reality system in the world coordinate system over the time interval [t_k, t_{k+1}] are propagated from the inertial measurements as:

p^w_{b_{k+1}} = p^w_{b_k} + v^w_{b_k} Δt_k + ∫∫_{t∈[t_k,t_{k+1}]} ( R^w_t (â_t - b_{a_t} - n_a) - g^w ) dt²
v^w_{b_{k+1}} = v^w_{b_k} + ∫_{t∈[t_k,t_{k+1}]} ( R^w_t (â_t - b_{a_t} - n_a) - g^w ) dt
q^w_{b_{k+1}} = q^w_{b_k} ⊗ ∫_{t∈[t_k,t_{k+1}]} ½ Ω(ω̂_t - b_{ω_t} - n_ω) q^{b_k}_t dt,

where Ω(ω) is the quaternion kinematic matrix of the angular velocity ω, and Δt_k is the time interval [t_k, t_{k+1}].
It can be seen that the IMU state propagation requires the rotation, position and velocity of frame b_k. In the global pose graph optimization algorithm adopted by the invention, the state propagation equation between two adjacent frames would therefore have to be recomputed each time the head-mounted augmented reality system adjusts its pose, which requires significant computational resources. To avoid such re-propagation, the invention employs a pre-integration algorithm.
The IMU pre-integration algorithm changes the reference coordinate system from the world frame to the body frame b_k and pre-integrates only the terms related to the linear acceleration â and the angular velocity ω̂, as follows:

R^{b_k}_w p^w_{b_{k+1}} = R^{b_k}_w ( p^w_{b_k} + v^w_{b_k} Δt_k - ½ g^w Δt_k² ) + α^{b_k}_{b_{k+1}}
R^{b_k}_w v^w_{b_{k+1}} = R^{b_k}_w ( v^w_{b_k} - g^w Δt_k ) + β^{b_k}_{b_{k+1}}
q^{b_k}_w ⊗ q^w_{b_{k+1}} = γ^{b_k}_{b_{k+1}},

where:

α^{b_k}_{b_{k+1}} = ∫∫_{t∈[t_k,t_{k+1}]} R^{b_k}_t (â_t - b_{a_t} - n_a) dt²
β^{b_k}_{b_{k+1}} = ∫_{t∈[t_k,t_{k+1}]} R^{b_k}_t (â_t - b_{a_t} - n_a) dt
γ^{b_k}_{b_{k+1}} = ∫_{t∈[t_k,t_{k+1}]} ½ Ω(ω̂_t - b_{ω_t} - n_ω) γ^{b_k}_t dt.
Obviously, with b_k as the reference frame, the pre-integration terms can be obtained from the IMU measurements alone; α̂^{b_k}_{b_{k+1}}, β̂^{b_k}_{b_{k+1}} and γ̂^{b_k}_{b_{k+1}} depend only on the IMU biases at b_k and b_{k+1}. When the bias estimate changes, if the change is small the pre-integration terms are adjusted by their first-order approximation with respect to the bias; otherwise they are recomputed. Because the pre-integration does not need to be repeatedly recomputed from the raw IMU measurements, this saves a considerable amount of computational resources for optimization-based algorithms.
For the discrete numerical integration on the computer, the invention selects the Euler integration method. In the initial state, α̂^{b_k}_{b_k} and β̂^{b_k}_{b_k} are zero and γ̂^{b_k}_{b_k} is the identity quaternion. The mean values of α, β and γ in the formulas are propagated step by step; since the additive noise terms n_a, n_ω obey Gaussian distributions, they are treated as zero in the invention. In summary, the estimated values of the pre-integration terms are:

α̂^{b_k}_{i+1} = α̂^{b_k}_i + β̂^{b_k}_i δt + ½ R(γ̂^{b_k}_i)(â_i - b_{a_i}) δt²
β̂^{b_k}_{i+1} = β̂^{b_k}_i + R(γ̂^{b_k}_i)(â_i - b_{a_i}) δt
γ̂^{b_k}_{i+1} = γ̂^{b_k}_i ⊗ [1, ½ (ω̂_i - b_{ω_i}) δt]^T,

where i corresponds to a discrete IMU measurement instant within [t_k, t_{k+1}] and δt is the time interval between two IMU measurements i and i + 1.
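A minimal Euler pre-integration sketch following the update equations above (illustrative only; scipy Rotations stand in for the quaternions γ̂, and the noise terms are treated as zero as in the mean propagation).

import numpy as np
from scipy.spatial.transform import Rotation as R

def preintegrate(acc, gyro, dt, ba, bw):
    """Euler pre-integration of IMU samples between frames b_k and b_{k+1}.
    acc, gyro: lists of accelerometer / gyroscope samples in the body frame
    dt:        list of time intervals between consecutive samples
    ba, bw:    accelerometer and gyroscope bias estimates at b_k."""
    alpha = np.zeros(3)                     # pre-integrated position term
    beta = np.zeros(3)                      # pre-integrated velocity term
    gamma = R.identity()                    # pre-integrated rotation (quaternion)
    for a_hat, w_hat, d in zip(acc, gyro, dt):
        a = gamma.apply(a_hat - ba)         # de-biased acceleration rotated into frame b_k
        alpha += beta * d + 0.5 * a * d * d
        beta += a * d
        gamma = gamma * R.from_rotvec((w_hat - bw) * d)   # small rotation update
    return alpha, beta, gamma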
For the computation of the IMU Jacobian and covariance matrices, since the four-dimensional rotation quaternion γ̂^{b_k}_t is over-parameterized, its error term is defined as a perturbation around its mean:

γ^{b_k}_t ≈ γ̂^{b_k}_t ⊗ [1, ½ δθ^{b_k}_t]^T,

where δθ^{b_k}_t is a three-dimensional small perturbation. The continuous-time linearized error dynamics are therefore

δż^{b_k}_t = F_t δz^{b_k}_t + G_t n_t,

where δz^{b_k}_t = [δα^{b_k}_t, δβ^{b_k}_t, δθ^{b_k}_t, δb_{a_t}, δb_{ω_t}]^T is the error state, n_t = [n_a, n_ω, n_{b_a}, n_{b_ω}]^T is the noise vector, and F_t and G_t are the Jacobians of the error dynamics with respect to the error state and the noise, respectively. The covariance is computed recursively by first-order discrete-time propagation, with the initial covariance P^{b_k}_{b_k} = 0:

P^{b_k}_{t+δt} = (I + F_t δt) P^{b_k}_t (I + F_t δt)^T + (G_t δt) Q (G_t δt)^T,  t ∈ [k, k+1],

where Q is the diagonal covariance matrix of the noise,

Q = diag(σ_a², σ_ω², σ_{b_a}², σ_{b_ω}²).

At the same time, the first-order Jacobian matrix of δz^{b_k}_{b_{k+1}} with respect to δz^{b_k}_{b_k} can also be computed recursively, starting from the initial Jacobian J_{b_k} = I:

J_{t+δt} = (I + F_t δt) J_t,  t ∈ [k, k+1].

Using these recursions, the covariance matrix P^{b_k}_{b_{k+1}} and the Jacobian matrix J_{b_{k+1}} are obtained. The first-order approximation of the pre-integration terms with respect to the biases can then be written as:

α̂^{b_k}_{b_{k+1}} ≈ α̂^{b_k}_{b_{k+1}} + J^α_{b_a} δb_{a_k} + J^α_{b_ω} δb_{ω_k}
β̂^{b_k}_{b_{k+1}} ≈ β̂^{b_k}_{b_{k+1}} + J^β_{b_a} δb_{a_k} + J^β_{b_ω} δb_{ω_k}
γ̂^{b_k}_{b_{k+1}} ≈ γ̂^{b_k}_{b_{k+1}} ⊗ [1, ½ J^γ_{b_ω} δb_{ω_k}]^T,

where J^α_{b_a} is the sub-block matrix of J_{b_{k+1}} whose position corresponds to δα/δb_a, and the other sub-blocks have the same meaning. When the bias estimate changes slightly, the pre-integration result can be corrected approximately by these formulas without recomputing the integration, which reduces the amount of computation. Using the covariance P^{b_k}_{b_{k+1}}, the IMU measurement model relates the pre-integrated terms [α̂^{b_k}_{b_{k+1}}, β̂^{b_k}_{b_{k+1}}, γ̂^{b_k}_{b_{k+1}}] to the states of frames b_k and b_{k+1} with measurement noise of covariance P^{b_k}_{b_{k+1}}.
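The discrete covariance and Jacobian recursions can be sketched directly from the formulas above (illustrative; F_t and G_t are assumed to be supplied by the linearized error dynamics, and the dimensions assume the 15-dimensional error state and a 12-dimensional noise vector).

import numpy as np

def propagate_cov_jacobian(F_list, G_list, dt_list, Q):
    """First-order discrete propagation of the pre-integration covariance and Jacobian.
    F_list, G_list: error-dynamics Jacobians F_t (15x15) and G_t (15x12) per IMU step
    Q:              diagonal noise covariance diag(sigma_a^2, sigma_w^2, sigma_ba^2, sigma_bw^2)."""
    P = np.zeros((15, 15))                  # initial covariance is zero
    J = np.eye(15)                          # initial Jacobian is identity
    for F, G, dt in zip(F_list, G_list, dt_list):
        A = np.eye(15) + F * dt
        P = A @ P @ A.T + (G * dt) @ Q @ (G * dt).T
        J = A @ J
    return P, J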
the visual and inertial data fusion process described, using two successive frame images b in a sliding windowkAnd bk+1According to the IMU measurement model, using the accelerometer and gyroscope measurements containing noise in the time interval between two consecutive frames of images, calculating the residual error of the pre-integrated inertial measurement data as:
Figure BDA0003476374720000372
wherein [ ·]xyzThe vector portion of the extracted quaternion q represents the error state,
Figure BDA0003476374720000373
three-dimensional error being quaternionThe status is represented by a status indication,
Figure BDA0003476374720000374
the method is an IMU pre-integration measurement, the part uses accelerometer and gyroscope measurement values containing noise in a time interval between two continuous frames of images, and the online correction residual error also contains the deviation of the accelerometer and the gyroscope.
For the residual of the visual measurement, unlike the pinhole camera model that defines the reprojection error on a generalized image plane, the invention defines the measurement residual on a unit sphere of the vision sensor. Considering the l-th feature, first observed in the i-th frame, its observation residual in the j-th frame is defined as

r_C(ẑ^{c_j}_l, X) = [b_1, b_2]^T · ( P̂^{c_j}_l / ||P̂^{c_j}_l|| - P^{c_j}_l / ||P^{c_j}_l|| ),

where P̂^{c_j}_l = π_c^{-1}([û^{c_j}_l, v̂^{c_j}_l]^T) is the observation of the feature in the j-th frame back-projected onto the unit sphere, and P^{c_j}_l is the prediction of the same feature in the j-th frame obtained by transforming its first observation in the i-th frame through its inverse depth and the poses of frames i and j. Here [û^{c_i}_l, v̂^{c_i}_l]^T is the first observation of the l-th feature in the i-th frame, [û^{c_j}_l, v̂^{c_j}_l]^T is the observation of the same feature in the j-th frame, and π_c^{-1} is the back-projection function that converts a pixel location into a unit vector using the intrinsic parameters of the vision sensor. Since the degree of freedom of the visual residual is 2, the residual vector is projected onto a tangent plane: b_1 and b_2 are two arbitrarily chosen orthogonal bases spanning the tangent plane of P̂^{c_j}_l / ||P̂^{c_j}_l||, as shown in FIG. 7, and the standard covariance has a fixed length in this tangent space. By transforming the feature first observed in the i-th frame into the j-th frame, the feature measurement on the unit sphere is predicted, and the residual vector lies on the tangent plane of P̂^{c_j}_l / ||P̂^{c_j}_l||.
The visual and inertial data fusion process adopts a marginalization method to reduce the computational complexity of the nonlinear-optimization-based VIO: selected inertial state quantities x_k and feature quantities λ_l are marginalized out of the sliding window, and the quantities associated with the marginalized states are converted into prior data, so that spatially separated key frames remain in the sliding window while providing sufficient parallax for feature triangulation. As shown in FIG. 8, when the next new frame is a key frame, it is retained in the sliding window and the measurements corresponding to the earliest frame are marginalized. If the next new frame is not a key frame, only its inertial measurement data are kept and its visual measurement data are discarded; the inertial data remain associated with the non-key frame, and, to preserve the sparsity of the system, the measurements of non-key frames are not marginalized. The marginalization is computed with the Schur complement, and a new prior is constructed from all marginalized measurement data related to the removed states and added to the existing prior.
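Marginalization via the Schur complement can be sketched as follows (illustrative; H and b are the Hessian and gradient of the linearized problem, with the states to be marginalized ordered first).

import numpy as np

def marginalize(H, b, m):
    """Schur complement of the first m states (the marginalized block).
    Returns the prior Hessian and gradient acting on the remaining states."""
    Hmm = H[:m, :m]
    Hmr = H[:m, m:]
    Hrm = H[m:, :m]
    Hrr = H[m:, m:]
    bm, br = b[:m], b[m:]
    Hmm_inv = np.linalg.inv(Hmm + 1e-8 * np.eye(m))   # small regularization for numerical safety
    H_prior = Hrr - Hrm @ Hmm_inv @ Hmr
    b_prior = br - Hrm @ Hmm_inv @ bm
    return H_prior, b_prior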
After the estimator has been initialized, the state estimation of the head-mounted augmented reality system is performed with the sliding-window-based tightly coupled monocular VIO, as shown in FIG. 9.
The complete state vector in the sliding window is:

X = [x_0, x_1, ..., x_n, λ_0, λ_1, ..., λ_m]
x_k = [p^w_{b_k}, v^w_{b_k}, q^w_{b_k}, b_a, b_ω],  k ∈ [0, n],

where x_k is the IMU state at the time the k-th frame is captured, comprising the position, velocity and orientation of the IMU in the world coordinate system and the accelerometer and gyroscope biases of the IMU in the body coordinate system; n is the total number of key frames and m the total number of features in the sliding window; λ_l is the inverse depth of the l-th feature from its first observation.
The invention performs the fusion optimization with a visual-inertial bundle adjustment. The maximum a posteriori estimate is obtained by minimizing the sum of the prior term and the Mahalanobis norms of all measurement residuals:

min_X { || r_p - H_p X ||² + Σ_{k∈B} || r_B(ẑ^{b_k}_{b_{k+1}}, X) ||²_{P^{b_k}_{b_{k+1}}} + Σ_{(l,j)∈C} ρ( || r_C(ẑ^{c_j}_l, X) ||²_{P^{c_j}_l} ) },

where the Huber norm is:

ρ(s) = s,  s ≤ 1;   ρ(s) = 2√s - 1,  s > 1,

and r_B(ẑ^{b_k}_{b_{k+1}}, X) and r_C(ẑ^{c_j}_l, X) are the IMU and visual measurement residuals respectively, B is the set of all IMU measurements, C is the set of features observed at least twice in the current sliding window, and {r_p, H_p} is the prior information from marginalization.
The sliding window and marginalization scheme effectively limits the computational complexity, but drift accumulates in the global 3D position (x, y, z) estimate and in the rotation about the gravity direction (yaw). To eliminate this drift, the invention adopts a tightly coupled relocation method that is seamlessly integrated with the monocular VIO.
In the relocation process, loop detection is realized with a bag-of-words model. The bag-of-words model is first built from feature point data actually collected in the space operating environment; when the position of the vision sensor changes, a new feature point set is computed in real time and, after being formed into a new description vector, is matched against the key-frame feature description vectors; if the matching succeeds, a loop is detected. The visited positions are recognized with the bag-of-words model, a feature-level connection is then established between the loop candidate and the current frame, and the feature pairs are integrated into the nonlinear-optimization-based visual-inertial odometry (VIO) to obtain a drift-free estimate of the target state. Because the relocation uses multiple observations of multiple features, the positioning accuracy and the smoothness of the state estimate are effectively improved. The description vector only records whether a certain feature appears and does not consider where it appears; by matching the feature point set obtained at a position against the bag of words, the image at that position is described as

A = a_1 w_1 + a_2 w_2 + a_3 w_3 + ... + a_n w_n,

where w_i denotes the i-th feature (word) in the bag of words, a_i the frequency with which that feature appears, and [a_1, a_2, ..., a_n] is the description vector. The L1 norm is used to compare description vectors, i.e., the distance between two description vectors a and b is computed from the L1 norm of their difference over all W dimensions, where W is the dimension of the description vectors.
The bag-of-words model is constructed by clustering, and a k-ary tree is used to represent the bag of words so that the feature matching process does not consume too much time. From the feature point data collected in preliminary tests, a number of samples are selected at random for clustering, and the data are represented by a 10-ary tree of depth 5, which can accommodate about 100,000 features. Since an image obtained by the vision camera generally contains no more than 500 features, this bag of words can express the feature distribution of an image relatively completely; the 5-layer 10-ary tree describing the bag of words is shown in FIG. 10. The clustering is realized with the K-means algorithm: the number of clusters K is fixed, and K D-dimensional feature point data vectors are selected at random as the initial cluster centres, where D is the dimension of the feature point data; the distance (L1 norm) between each feature point data vector and each cluster centre is computed, and the nearest cluster centre is chosen as the class to which the vector belongs; within each class the cluster centre is recomputed, and the computation and classification steps are iterated until the sample data converge. The first four layers are processed in this way, after which the clustered bag-of-words model is obtained. Loop detection is implemented with the TF-IDF algorithm. Intuitively, loop detection only needs to judge whether the image obtained at each position is consistent with a previously obtained image, i.e., whether the description vectors are close enough. The invention considers that different features contribute differently to discrimination: features that appear particularly frequently do not really represent the key information of an image, so the TF-IDF (Term Frequency-Inverse Document Frequency) method is introduced. TF means that the more frequently a feature appears in a single image, the better that feature discriminates the image; IDF means that the less frequently a feature appears in the bag of words, the more easily an image containing that feature is classified correctly. The IDF and TF parts are taken into account when building the bag of words:

IDF_i = log( n / n_i ),
TF_i = m_i / m,

where n is the number of all features and n_i is the number of occurrences of feature w_i (i.e., a certain leaf node) in the samples used to build the bag of words, while m_i is the number of occurrences of w_i in the current image and m the total number of features in that image. The weight η_i of w_i is

η_i = IDF_i × TF_i.

Taking the weights into account, a new image A can be expressed as

A = η_1 w_1 + η_2 w_2 + ... + η_n w_n.
The image A is compared with previously processed frames; when the difference between the description vectors is less than 5%, the invention considers that the position has been reached before, and the loop detection is complete.
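A small sketch of the TF-IDF-weighted description vector and the loop-candidate comparison it feeds (illustrative; the 5% threshold follows the text, while the specific L1-based similarity is one common choice and an assumption here).

import numpy as np

def describe_image(word_ids, idf, num_words):
    """Build a TF-IDF weighted description vector from the quantized features of one image.
    word_ids: integer word index of each feature in the image
    idf:      per-word IDF weights learned when building the bag of words."""
    counts = np.bincount(word_ids, minlength=num_words)
    tf = counts / max(len(word_ids), 1)     # term frequency within this image
    return tf * idf                         # eta_i = IDF_i * TF_i

def is_loop_candidate(vec_a, vec_b, threshold=0.05):
    """Compare two description vectors with an L1-based distance."""
    na = vec_a / max(np.sum(np.abs(vec_a)), 1e-12)
    nb = vec_b / max(np.sum(np.abs(vec_b)), 1e-12)
    dist = 0.5 * np.sum(np.abs(na - nb))    # 0 = identical, 1 = disjoint
    return dist < threshold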
In practice, two adjacent frames differ so little that they are easily judged as a loop, which would make loop detection meaningless. The invention therefore introduces a clustering step: images within a short period are compared, frames that are too close are discarded, and only key frames are retained, greatly reducing the loop-detection overhead of the system. In addition to the corner features used by the monocular VIO, the system detects an extra 500 corner points described by BRIEF descriptors, and these additional corner features improve the recall of loop detection. The invention keeps all BRIEF descriptors for feature retrieval and discards the original images, reducing memory consumption and improving system performance.
When a loop is detected, the BRIEF descriptors are matched, feature consistency is checked, and the connection between the local sliding window and the loop candidate is established. However, direct descriptor matching may produce a large number of outliers, so the outliers are removed geometrically: a fundamental-matrix test is performed with the RANSAC algorithm using the 2D features retrieved in the current image and the candidate loop, and a PnP pose test is performed with the RANSAC algorithm using the 3D positions of features known in the local sliding window and their 2D projections in the candidate loop image. When the number of inliers exceeds 50, the candidate is treated as a correct loop detection and relocation is performed.
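The two geometric checks described above map directly onto OpenCV calls; a hedged sketch follows (the inlier threshold of 50 follows the text, while the RANSAC parameters are assumptions).

import cv2
import numpy as np

def verify_loop(pts_cur_2d, pts_loop_2d, pts_cur_3d, K):
    """Fundamental-matrix test and PnP test with RANSAC for a loop candidate.
    pts_cur_2d / pts_loop_2d: matched 2D features in the current and candidate images.
    pts_cur_3d:               3D positions of the current-window features (same order)."""
    # 1) epipolar consistency between the two images
    F, mask_f = cv2.findFundamentalMat(pts_cur_2d.astype(np.float32),
                                       pts_loop_2d.astype(np.float32),
                                       cv2.FM_RANSAC, 3.0, 0.99)
    if F is None:
        return False
    keep = mask_f.ravel() == 1
    # 2) PnP between the known 3D features and their projections in the loop image
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_cur_3d[keep].astype(np.float32),
        pts_loop_2d[keep].astype(np.float32), K, None,
        reprojectionError=8.0, iterationsCount=100)
    return bool(ok) and inliers is not None and len(inliers) > 50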
For the tightly coupled relocation, the relocation process efficiently aligns the current sliding window with the historical pose graph maintained by the monocular VIO. During relocation, the poses of all loop frames are treated as constants, and the sliding window is optimized jointly using the IMU measurements, the local visual measurements and the feature correspondences detected in the loop. FIG. 11(a) shows the relocation process: it starts from VIO-only pose estimation (blue) and records the historical states (green); if a loop is detected for the latest frame (red line in the figure), relocation takes place. Because the relocation uses feature-level associations, the system can easily incorporate loop constraints from multiple key frames. FIG. 11(b) shows the pose graph optimization process: when the last key frame in the sliding window is marginalized, it is added to the pose graph, and if a loop exists between this key frame and other key frames, a loop constraint expressed as a 4-degree-of-freedom relative rigid-body transformation is added to the pose graph. Features observed in the loop frame v have the same visual measurement model as in the VIO visual measurement, but the loop frame pose q̂^w_v, p̂^w_v obtained from the pose graph or directly from the previous odometry output is treated as a constant, and the nonlinear loss function is therefore changed to:

min_X { || r_p - H_p X ||² + Σ_{k∈B} || r_B(ẑ^{b_k}_{b_{k+1}}, X) ||²_{P^{b_k}_{b_{k+1}}} + Σ_{(l,j)∈C} ρ( || r_C(ẑ^{c_j}_l, X) ||²_{P^{c_j}_l} ) + Σ_{(l,v)∈L} ρ( || r_C(ẑ^{c_v}_l, X, q̂^w_v, p̂^w_v) ||²_{P^{c_v}_l} ) }.
For the global pose graph optimization, after relocation the local sliding window is moved and aligned with the past poses. Using the relocation result, the registration of the set of past poses into a globally consistent configuration is ensured by global pose graph optimization. Because the visual-inertial setup renders roll and pitch fully observable, the accumulated drift occurs in only four degrees of freedom (x, y, z and yaw). For this reason only a 4-DoF pose graph optimization is performed, and the estimation of the drift-free roll and pitch states is left untouched. When a key frame is marginalized out of the sliding window, it is added to the pose graph as a new vertex, and this new vertex is connected with the other vertices of the pose graph by sequential edges and/or loop edges. A newly marginalized key frame establishes several sequential edges linking it to existing key frames; a sequential edge represents the relative transformation between two key frames in the local sliding window, and its value is taken directly from the VIO. For a newly marginalized key frame i and one of its previous key frames j, the sequential edge contains only the relative position p̂^i_{ij} and the relative yaw angle ψ̂_{ij}:

p̂^i_{ij} = (R^w_i)^{-1} ( p^w_j - p^w_i )
ψ̂_{ij} = ψ̂_j - ψ̂_i.
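The 4-DoF sequential-edge measurement can be computed from the VIO output with a short helper (illustrative; yaw is extracted with a z-y-x Euler convention, which is an assumption made here).

import numpy as np
from scipy.spatial.transform import Rotation as R

def sequential_edge(p_i, q_i, p_j, q_j):
    """Relative position (expressed in frame i) and relative yaw between
    key frames i and j, taken directly from the VIO estimates.
    q_i, q_j: quaternions as [x, y, z, w]."""
    Ri = R.from_quat(q_i)
    p_ij = Ri.inv().apply(p_j - p_i)            # relative position in frame i
    yaw_i = Ri.as_euler('zyx')[0]
    yaw_j = R.from_quat(q_j).as_euler('zyx')[0]
    psi_ij = yaw_j - yaw_i                      # relative yaw
    return p_ij, psi_ij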
If a newly marginalized key frame has a loop connection with an existing key frame, a loop edge in the pose graph connects it to the loop frame; the loop edge contains only a 4-degree-of-freedom relative pose transformation, whose value is obtained from the relocation result.
For the 4-DoF pose graph optimization, the residual of the edge between frames i and j is defined as:

r_{i,j}(p^w_i, ψ_i, p^w_j, ψ_j) = [ R(φ̂_i, θ̂_i, ψ_i)^{-1} ( p^w_j - p^w_i ) - p̂^i_{ij} ;  ψ_j - ψ_i - ψ̂_{ij} ],

where φ̂_i and θ̂_i are the roll and pitch estimates obtained directly from the monocular VIO. The entire graph of sequential and loop edges is optimized by minimizing the following cost function:

min_{p, ψ} { Σ_{(i,j)∈S} || r_{i,j} ||² + Σ_{(i,j)∈L} ρ( || r_{i,j} ||² ) },

where S is the set of all sequential edges and L is the set of all loop edges. Although the tightly coupled relocation already helps to eliminate wrong loops, the invention additionally introduces the Huber norm ρ(·) to further reduce the influence of any wrong loop. The pose graph optimization and the relocation run independently, so relocation can be performed while the pose graph is being optimized, and when the pose graph optimization has not yet finished, relocation can use the existing pose graph.
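A residual function matching the 4-DoF edge definition above, suitable for a generic nonlinear least-squares solver, with the Huber weight applied only to loop edges (illustrative sketch; the fixed roll/pitch and the 'ZYX' Euler convention are assumptions).

import numpy as np
from scipy.spatial.transform import Rotation as R

def edge_residual(p_i, yaw_i, p_j, yaw_j, roll_i, pitch_i, p_ij_meas, psi_ij_meas):
    """4-DoF residual of a sequential or loop edge between key frames i and j.
    roll_i and pitch_i are drift-free VIO estimates and are held fixed."""
    R_i = R.from_euler('ZYX', [yaw_i, pitch_i, roll_i]).as_matrix()
    r_p = R_i.T @ (p_j - p_i) - p_ij_meas          # translational residual in frame i
    r_yaw = (yaw_j - yaw_i) - psi_ij_meas          # yaw residual
    return np.hstack([r_p, r_yaw])

def huber(s, delta=1.0):
    """Huber weighting applied to the squared norm of loop-edge residuals."""
    return s if s <= delta else 2.0 * np.sqrt(s * delta) - delta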
For the pose graph management, the pose graph may grow without bound as the travelled distance increases, affecting the long-term performance of the system. The invention therefore applies a down-sampling process to maintain the size of the pose graph database: all key frames with loop constraints are retained, while other key frames that are too similar may be deleted, and the more similar a key frame is, the higher the probability that it is deleted.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A visual and inertial hybrid pose tracking method of a head-mounted augmented reality system, characterized in that device initialization is first performed, comprising measurement preprocessing of a vision sensor and an inertial measurement device and joint calibration of the vision sensor and the inertial measurement device; the head-mounted augmented reality system is then fusion-positioned using the measurement data acquired by the vision sensor and the inertial measurement device, and loop detection and global pose graph optimization are performed on the positioning result to obtain the final pose estimation result.
2. The method for tracking mixed pose of vision and inertia of head-mounted augmented reality system according to claim 1, wherein the measurement preprocessing comprises processing frame images acquired by the vision sensor and performing pre-integration on the corresponding inertial measurement results of two consecutive frame images.
3. The method for tracking the mixed pose of vision and inertia of the head-mounted augmented reality system according to claim 1, wherein the fusion positioning process fuses the pre-integrated inertia measurement result and the visual image feature observation result to realize the fusion positioning of the head-mounted augmented reality system.
4. The method for tracking mixed pose of vision and inertia of head-mounted augmented reality system of claim 1, wherein the global pose graph is optimized to eliminate drift by global optimization using repositioning results.
5. The method for tracking mixed pose of vision and inertia of head-mounted augmented reality system of claim 1, wherein the device initialization process is performed by initializing visual SFM algorithm based on sliding window, calculating pose change of the head-mounted augmented reality system, and restoring scale, gravity, speed and deviation by matching IMU pre-integration result with visual SFM algorithm result for guiding VIO based on non-linear optimization.
6. The visual and inertial hybrid pose tracking method of the head-mounted augmented reality system according to claim 5, wherein the sliding-window-based visual SFM algorithm is initialized and comprises feature point tracking, camera pose estimation and optimization; for the feature point tracking, the feature correspondences between the latest frame acquired by the head-mounted augmented reality system and all frames in the sliding window are first checked, and if the latest frame shares with any frame in the sliding window more than a certain number of tracked features and rotation-compensated parallax pixels, the relative rotation and up-to-scale translation of the latest frame are recovered using the tracked features and rotation-compensated pixels and all features in the two frames are triangulated; otherwise the latest frame is stored in the sliding window and a new frame is awaited; for the camera pose estimation, the poses of all frames in the sliding window are estimated from the triangulated features with the PnP method to obtain the camera poses; for the optimization, the total reprojection error of all feature observations is minimized with a BA optimization method to complete the optimized estimation of the camera pose; P is the position of the real-space object point corresponding to the feature points of the left and right frames acquired by the left-eye and right-eye cameras and is called the feature position, p1 and p2 are the projections of the object point onto the left-eye and right-eye cameras, and O1 and O2 are the centres of the left-eye and right-eye cameras; writing P = [X, Y, Z]^T and using the pinhole camera model, the feature points p1 and p2 satisfy:

s1 p1 = K1 P

s2 p2 = K2 (R P + t),

wherein K1 and K2 are the intrinsic matrices of the left-eye and right-eye cameras, R and t are the rigid-body rotation matrix and offset matrix between the left-eye and right-eye cameras, and s1, s2 are the scale factors of the left-eye and right-eye cameras; the least-squares solution for P yields the feature position estimate; from the feature point matching information and the feature position estimation, the three-dimensional spatial positions of the current feature points and their projections onto the camera are obtained with the SFM algorithm, and the image pose is then estimated directly with the PnP algorithm.
7. The method for tracking mixed pose of vision and inertia of head-mounted augmented reality system according to claim 1, wherein the vision sensor and the inertial measurement unit are calibrated jointly, and the inertial measurement unit is calibrated to obtain the initial estimation result of the target pose, velocity, gravity vector, gyroscope bias and three-dimensional feature position, which is used for guiding the vision-inertial odometer VIO based on nonlinear optimization to calculate the absolute scale factor, gyroscope bias, acceleration bias, gravity acceleration and relative velocity of the inertial measurement unit to the vision sensor at each measurement time.
8. The visual and inertial hybrid pose tracking method of the head-mounted augmented reality system according to claim 6, wherein
the calibration of the inertial measurement device comprises gyroscope bias calibration, gravity vector refinement, and calibration of the velocity, gravity vector and scale; for the gyroscope bias calibration, two consecutive frames b_k and b_{k+1} in the image window are considered: the visual SFM algorithm provides their rotations q^{c0}_{b_k} and q^{c0}_{b_{k+1}}, and the pre-integrated result of the corresponding inertial measurements from frame b_{k+1} to frame b_k is recorded as γ̂^{b_k}_{b_{k+1}}; the pre-integration term is linearized with respect to the gyroscope bias of the inertial measurement device, and the following minimization loss function is constructed:

min_{δb_ω} Σ_{k∈B} || (q^{c0}_{b_{k+1}})^{-1} ⊗ q^{c0}_{b_k} ⊗ γ^{b_k}_{b_{k+1}} ||²,

γ^{b_k}_{b_{k+1}} ≈ γ̂^{b_k}_{b_{k+1}} ⊗ [1, ½ J^γ_{b_ω} δb_ω]^T,

wherein B is the set of all frame indices in the image window, and the bias Jacobian J^γ_{b_ω} is used as the calibration coefficient of the first-order approximation with respect to the gyroscope bias; solving the minimization loss function yields the initial calibration of the gyroscope bias b_ω;
after the gyroscope bias has been initialized, the velocity, the gravity vector and the actual scale are further computed, the quantities to be computed being expressed as:

X_I = [v^{b_0}_{b_0}, v^{b_1}_{b_1}, ..., v^{b_n}_{b_n}, g^{c0}, s]^T,

wherein v^{b_k}_{b_k} is the velocity of the target in the body coordinate system when the k-th frame is acquired, g^{c0} is the gravity vector measured in the frame of the c0-th image, and s is the scale factor that aligns the visual SFM algorithm with metric units.
9. The visual and inertial hybrid pose tracking method of a head-mounted augmented reality system according to claim 1, wherein the vision sensor and the inertial measurement unit are jointly calibrated; $C_{obj}$ denotes the object coordinate system, which is taken as the world coordinate system; $C_c$ denotes the vision sensor coordinate system; $C_i$ denotes the inertial measurement unit coordinate system; the relative pose of $C_c$ with respect to $C_{obj}$ is obtained by a vision sensor calibration method, represented here by $R_a$ and $t_a$, i.e. the rotation and the three translational components of the vision sensor relative to the world coordinate system; from the relative position $t_x$ between $C_i$ and $C_c$, the relative rotation $R_x$ between $C_i$ and $C_{obj}$ is obtained; the joint calibration of the vision sensor and the inertial measurement unit is carried out by observing a known calibration reference in space from different positions and deriving the relationship between $R_x$, $t_x$ and the multiple observations.

The joint calibration of the vision sensor and the inertial measurement unit requires multi-dimensional parameter information from the inertial measurement unit, and its implementation process is as follows: first, the 3D information is acquired directly, i.e. a set of target attitude angles represented by three Euler angles, together with the raw angular velocity and linear acceleration, is read from the inertial measurement unit, so as to obtain the required target attitude time $t_x$; then the target initial position is assumed to be zero, i.e. $s_0 = 0$, and the target position $s_1$ after a time $\Delta t$ with initial velocity $v$ is estimated by the linear formula $s_1 = v\,\Delta t + \tfrac{1}{2}\,a\,\Delta t^2$, where $a$ is the target acceleration; the multi-dimensional parameter information comprises the target attitude angles, the raw angular velocity, the linear acceleration, the target attitude time and the target position.
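The constant-acceleration position update quoted in this claim is simple enough to show directly; a minimal Python sketch under the claim's zero-initial-position assumption (the function name and the numbers in the usage example are illustrative, not from the patent):

import numpy as np

def propagate_position(v, a, dt, s0=None):
    """Claim 9's kinematic update: s1 = s0 + v*dt + 0.5*a*dt**2, with the
    initial position s0 defaulting to the zero vector."""
    s0 = np.zeros(3) if s0 is None else np.asarray(s0, dtype=float)
    return s0 + np.asarray(v, dtype=float) * dt + 0.5 * np.asarray(a, dtype=float) * dt ** 2

# Example: v = 0.2 m/s along x and a = 0.1 m/s^2 give a displacement of
# 0.2*0.5 + 0.5*0.1*0.5**2 = 0.1125 m after dt = 0.5 s.
s1 = propagate_position(v=[0.2, 0.0, 0.0], a=[0.1, 0.0, 0.0], dt=0.5)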
10. The visual and inertial hybrid pose tracking method of a head-mounted augmented reality system according to claim 1, wherein the fusion positioning of the head-mounted augmented reality system using the measurement data collected by the vision sensor and the inertial measurement unit comprises measurement preprocessing, visual-inertial data fusion, relocalization, and global pose graph optimization.
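A minimal Python outline of the four stages named in this claim; each method is a stub and all names are assumptions rather than the patent's implementation.

class FusionPipeline:
    """Outline of the fusion positioning flow: measurement preprocessing,
    visual-inertial fusion, relocalization, global pose-graph optimization."""

    def preprocess(self, images, imu_samples):
        # Measurement preprocessing: feature extraction/tracking and IMU pre-integration.
        return [], []

    def fuse(self, features, preintegrated):
        # Tightly coupled visual-inertial fusion producing a local pose estimate.
        return {"pose": None}

    def relocalize(self, state, features):
        # Relocalization: match current features against previously visited keyframes.
        return state

    def optimize_pose_graph(self, state):
        # Global pose-graph optimization keeps the whole trajectory consistent.
        return state

    def process(self, images, imu_samples):
        features, preintegrated = self.preprocess(images, imu_samples)
        state = self.fuse(features, preintegrated)
        state = self.relocalize(state, features)
        return self.optimize_pose_graph(state)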
CN202210055655.3A 2022-01-18 2022-01-18 Visual and inertial mixed pose tracking method of head-mounted augmented reality system Pending CN114485637A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210055655.3A CN114485637A (en) 2022-01-18 2022-01-18 Visual and inertial mixed pose tracking method of head-mounted augmented reality system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210055655.3A CN114485637A (en) 2022-01-18 2022-01-18 Visual and inertial mixed pose tracking method of head-mounted augmented reality system

Publications (1)

Publication Number Publication Date
CN114485637A true CN114485637A (en) 2022-05-13

Family

ID=81512598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210055655.3A Pending CN114485637A (en) 2022-01-18 2022-01-18 Visual and inertial mixed pose tracking method of head-mounted augmented reality system

Country Status (1)

Country Link
CN (1) CN114485637A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102890781A (en) * 2012-07-04 2013-01-23 北京航空航天大学 Method for identifying wonderful shots as to badminton game video
CN104239411A (en) * 2014-08-12 2014-12-24 中国科学技术大学 Color and position clustering and angular point detection-based detection method for grid-shaped radar
CN108732570A (en) * 2017-04-20 2018-11-02 深圳市气象局 The nowcasting method of disastrous convection weather based on particle filter blending algorithm
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
MARTINEZ-DIAZ, SAUL: "3D Distance Measurement from a Camera to a Mobile Vehicle, Using Monocular Vision", Journal of Sensors, vol. 2021, pages 1-8 *
QIN, TONG ET AL.: "VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator", IEEE Transactions on Robotics, vol. 34, no. 4, pages 1004-1020, XP011688854, DOI: 10.1109/TRO.2018.2853729 *
YAN, PENG: "Research on Robot Localization Technology Based on IMU and Vision Fusion", China Master's Theses Full-text Database (Information Science and Technology), pages 138-2535 *
HE, NING ET AL.: "Spatial Positioning of an Augmented Reality Space Flight Training System", Space Medicine & Medical Engineering, vol. 31, no. 2, pages 255-260 *
ZHU, ZHUANGSHENG; YUAN, XUEZHONG: "Calibration Method for the Accuracy of a Monocular-Vision-Based Position and Attitude Measurement System", Journal of Chinese Inertial Technology, no. 05, pages 100-105 *
GONG, ZHAOHUI; ZHANG, XIAOLI; PENG, XIAFU; LI, XIN: "Semi-Direct Monocular Visual Odometry Based on Visual-Inertial Fusion", Robot, no. 05, pages 85-95 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114860079A (en) * 2022-05-17 2022-08-05 东南大学 Dynamic robust head-mounted augmented reality device registration method
CN115342829A (en) * 2022-07-20 2022-11-15 中国科学院自动化研究所 Mileage calculation method and device based on iterative extended Kalman filter
CN116205947A (en) * 2023-01-03 2023-06-02 哈尔滨工业大学 Binocular-inertial fusion pose estimation method based on camera motion state, electronic equipment and storage medium
CN116205947B (en) * 2023-01-03 2024-06-07 哈尔滨工业大学 Binocular-inertial fusion pose estimation method based on camera motion state, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109307508B (en) Panoramic inertial navigation SLAM method based on multiple key frames
CN109974693B (en) Unmanned aerial vehicle positioning method and device, computer equipment and storage medium
CN107341814B (en) Four-rotor unmanned aerial vehicle monocular vision range measurement method based on sparse direct method
CN106446815B (en) A kind of simultaneous localization and mapping method
KR102427921B1 (en) Apparatus and method for real-time mapping and localization
CN114485637A (en) Visual and inertial mixed pose tracking method of head-mounted augmented reality system
WO2018129715A1 (en) Simultaneous positioning and dense three-dimensional reconstruction method
US7212228B2 (en) Automatic camera calibration method
CN110726406A (en) Improved nonlinear optimization monocular inertial navigation SLAM method
CN108615246B (en) Method for improving robustness of visual odometer system and reducing calculation consumption of algorithm
CN112219087A (en) Pose prediction method, map construction method, movable platform and storage medium
CN110260861B (en) Pose determination method and device and odometer
CN112815939B (en) Pose estimation method of mobile robot and computer readable storage medium
JP2021518622A (en) Self-location estimation, mapping, and network training
CN111862316B (en) Three-dimensional reconstruction method of dense direct RGBD (Red Green blue-white) of tight coupling of IMU (inertial measurement Unit) based on optimization
CN111623773B (en) Target positioning method and device based on fisheye vision and inertial measurement
CN112115980A (en) Binocular vision odometer design method based on optical flow tracking and point line feature matching
Seiskari et al. HybVIO: Pushing the limits of real-time visual-inertial odometry
CN111998862A (en) Dense binocular SLAM method based on BNN
CN114529576A (en) RGBD and IMU hybrid tracking registration method based on sliding window optimization
CN116007609A (en) Positioning method and computing system for fusion of multispectral image and inertial navigation
CN116772844A (en) Navigation method of visual inertial indoor robot based on dynamic environment
Liu et al. Real-time dense construction with deep multi-view stereo using camera and IMU sensors
CN112991400B (en) Multi-sensor auxiliary positioning method for unmanned ship
Zhu et al. PairCon-SLAM: Distributed, online, and real-time RGBD-SLAM in large scenarios

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination