CN113188557A - Visual inertial integrated navigation method fusing semantic features - Google Patents

Visual inertial integrated navigation method fusing semantic features

Info

Publication number
CN113188557A
CN113188557A
Authority
CN
China
Prior art keywords
semantic
plane
visual
coordinate system
pose
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110467584.3A
Other languages
Chinese (zh)
Other versions
CN113188557B (en)
Inventor
黄郑
王红星
雍成优
朱洁
刘斌
吕品
陈玉权
何容
吴媚
赖际舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Jiangsu Fangtian Power Technology Co Ltd
Original Assignee
Nanjing University of Aeronautics and Astronautics
Jiangsu Fangtian Power Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics, Jiangsu Fangtian Power Technology Co Ltd
Priority to CN202110467584.3A
Publication of CN113188557A
Application granted
Publication of CN113188557B
Legal status: Active (Current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/3407 Route searching; Route guidance specially adapted for specific applications
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C22/00 Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S11/00 Systems for determining distance or velocity not using reflection or reradiation
    • G01S11/12 Systems for determining distance or velocity not using reflection or reradiation using electromagnetic waves other than radio waves
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Electromagnetism (AREA)
  • Navigation (AREA)

Abstract

The invention discloses a visual inertial integrated navigation method fusing semantic features. At time k, RGBD visual sensor data S(k), accelerometer data ã(k) and gyroscope data ω̃(k) are collected. The current camera pose T(k) is computed with a visual odometer from the visual sensor data S(k); semantic plane features are extracted from S(k) and matched between two adjacent image frames; the inertial sensor data are pre-integrated between the two adjacent image frames; the carrier navigation information is then solved by jointly optimizing the semantic plane observation residual, the visual-odometer relative-pose observation residual and the inertial pre-integration residual; finally, the carrier navigation information and the camera intrinsic parameters are output. The invention can effectively improve the positioning accuracy and robustness of the navigation system.

Description

Visual inertial integrated navigation method fusing semantic features
Technical Field
The invention belongs to the technical field of robot navigation, and particularly relates to a visual inertial integrated navigation method fusing semantic features.
Background
Thanks to the richness of its perception information, visual SLAM has become a major research topic in autonomous robot navigation. Conventional visual SLAM methods describe the environment and compute poses from low-level features such as points and lines; these features are described and matched through low-level intensity relationships, so the structural redundancy in the scene is not fully exploited. Moreover, a single vision sensor, with its limited perception dimension, can hardly provide robust positioning for an unmanned aerial vehicle in a complex indoor environment. Both issues affect the accuracy and reliability of the navigation system.
Disclosure of Invention
To address the defects of the prior art, the invention provides a visual inertial integrated navigation method fusing semantic features, so as to improve the autonomous positioning accuracy of an unmanned aerial vehicle.
To achieve this purpose, the technical solution of the invention is as follows:
a visual inertial integrated navigation method fusing semantic features comprises the following steps:
step 1, collecting RGBD visual sensor data S(k), accelerometer data ã(k) and gyroscope data ω̃(k) at time k;
step 2, solving for the current camera pose T(k) with a visual odometer from the visual sensor data S(k);
step 3, constructing a semantic plane feature map from the visual sensor data between two adjacent image frames, matching the semantic planes of the current frame with the semantic landmarks in the map, and obtaining the observation relation between the current key frame and the semantic landmarks;
step 4, performing inertial pre-integration between two adjacent image frames from the inertial sensor data, where the inertial sensor data comprise the accelerometer data and the gyroscope data;
step 5, taking the sum of the semantic plane observation residual, the visual-odometer relative-pose observation residual and the inertial pre-integration residual as a joint optimization function, and performing nonlinear optimization on this function to solve for the pose;
step 6, outputting the carrier navigation information and the camera intrinsic parameters, and returning to step 1.
Preferably, acquiring the current pose of the camera specifically comprises: first extracting ORB features from two adjacent frames, then computing the relative pose between the two key frames from the ORB feature matches between the frames using PnP, and obtaining the camera pose T(k) at time k by accumulating the relative poses.
Preferably, plane information is acquired from the visual sensor data between two adjacent image frames, and a semantic plane feature is constructed from the semantic category, centroid, normal direction and horizontal/vertical type of the plane:
· s_p = {p_x, p_y, p_z}
· s_n = {n_x, n_y, n_z, n_d}
· s_o ∈ {horizontal, vertical}
· s_c = the detected semantic object class corresponding to the plane
where s_p is the plane centroid, s_n is the plane normal parameter, s_o is the plane type label (horizontal/vertical), and s_c is the plane semantic class label, which depends on the semantic object class to which the plane corresponds;
a semantic plane feature map is constructed from the constructed semantic plane features;
the initial semantic landmarks are defined, and each detected semantic plane S_k of every frame is matched against the initial semantic landmarks by category, normal direction and centroid, yielding the observation relation between the current key frame and the semantic landmarks.
Preferably, the inertial sensor data obtained at time k comprise the accelerometer data ã(i) and gyroscope data ω̃(i) from time k-1 to time k, which are used to construct the inertial sensor measurement model:
ã = a + b_a + R^b_W · g^W + n_a
ω̃ = ω + b_ω + n_ω
where n_a and n_ω are the accelerometer and gyroscope white noise respectively; b_a and b_ω are the accelerometer and gyroscope random walks, whose derivatives are white noise; a is the ideal accelerometer measurement and ω the ideal gyroscope measurement; g^W is the gravity in the navigation frame; and R^b_W is the rotation matrix from the navigation coordinate system to the body coordinate system at the sampling instant;
a plurality of inertial sensor samples lie between two adjacent image frames, and all inertial sensor data between the two adjacent image frames are pre-integrated iteratively:
α_{i+1} = α_i + β_i·Δt + ½·R(γ_i)·(ã_i − b_a)·Δt²
β_{i+1} = β_i + R(γ_i)·(ã_i − b_a)·Δt
γ_{i+1} = γ_i ⊗ [1, ½·(ω̃_i − b_ω)·Δt]
where α is the position pre-integration, β the velocity pre-integration, the rotation is represented by the quaternion γ, and γ is the rotation pre-integration; at the start, α and β are 0 and γ is the unit quaternion; R(γ) denotes the conversion of a quaternion into a rotation matrix and ⊗ denotes quaternion multiplication;
the frequency of the visual data is made consistent with the inertial pre-integration frequency, yielding the inertial pre-integration terms α, β and γ between two adjacent image frames.
Preferably, the optimization function is:
e(X) = Σ_{(i,k)} ρ( ||e^S_{ik}||²_{Σ_S} ) + Σ_{(i,j)} ρ( ||e^V_{ij}||²_{Σ_V} ) + Σ_{(i,j)} ||e^B_{ij}||²_{Σ_B},   with ||e||²_Σ = eᵀ Σ⁻¹ e
where the Semantic part (first sum) represents the semantic landmark observation residual, e^S_{ik} being the observation error of a semantic landmark observed in a given camera frame; the VO part (second sum) represents the observation residual of the relative pose in the visual odometer; the third term is the IMU pre-integration estimation residual; ρ(·) is a robust kernel function; Σ⁻¹ is the information matrix associated with each error term, i.e. the inverse of its covariance, representing the a-priori confidence in each constraint; i and j denote the i-th and j-th image frames, and k the k-th semantic landmark.
Preferably, the semantic landmark observation error e compares the position observation P̃^{Ci}_{Lk} of the semantic landmark in the camera coordinate system with the estimate obtained by transforming the landmark position into the camera frame through the camera pose, where P̃^{Ci}_{Lk} is the position observation of the semantic landmark in the camera coordinate system, and the camera pose T^W_{Ci} at the i-th frame and the position P^W_{Lk} of landmark L_k in the world frame are the variables to be optimized; the Jacobian matrices of the error e with respect to T^W_{Ci} and P^W_{Lk} are then computed;
the pose observation residual e compares the relative-pose observation T̃^{Ci}_{Cj} of the visual odometer between the i-th and j-th frames with the relative pose predicted from the poses T^W_{Ci} and T^W_{Cj} to be optimized; the Jacobian matrices of this error with respect to the poses T^W_{Ci} and T^W_{Cj} are computed;
the inertial pre-integration residual is computed as the difference between the predicted values and the pre-integration values between two adjacent image frames.
Preferably, the Jacobian matrices of e in the semantic landmark observation error with respect to T^W_{Ci} and P^W_{Lk} are derived analytically, where ξ is the Lie-algebra representation of the pose T^W_{Ci}, [X_i′ Y_i′ Z_i′]ᵀ are the coordinates of the semantic plane in the camera frame at time i, f_x and f_y are the focal lengths from the camera intrinsic parameters, and R^{Ci}_W is the rotation component of T^{Ci}_W.
Preferably, in the pose observation residual, the Jacobian matrices of e with respect to the poses T^W_{Ci} and T^W_{Cj} are derived using the Lie-algebra right perturbation model and its adjoint property.
Preferably, the inertial pre-integration residual stacks the position, velocity, rotation, accelerometer-random-walk and gyroscope-random-walk errors between the predicted values and the pre-integrated values, where R^{bj}_W is the rotation matrix from the navigation coordinate system to the body coordinate system at time j; p^W_{bi} and p^W_{bj} are the positions of the body coordinate system in the navigation coordinate system at times i and j respectively; v^W_{bi} and v^W_{bj} are the corresponding velocities; Δt_i is the time interval between two adjacent image frames; q^W_{bi} and q^W_{bj} are the quaternions of the rotation of the body coordinate system in the navigation coordinate system at times i and j respectively; b_{a,i} and b_{a,j} are the accelerometer random walks in the body coordinate system at times i and j respectively; b_{ω,i} and b_{ω,j} are the gyroscope random walks in the body coordinate system at times i and j respectively; and [γ]_{xyz} denotes taking the x, y, z components of the quaternion γ.
The invention provides the following technical effects. When the camera maneuvers strongly, motion estimation that relies on visual features alone yields low navigation and positioning accuracy. The invention fuses semantic features and inertial information into the optimization objective function of the visual odometer, adding high-dimensional feature observations and additional sensor observation constraints to the visual odometer; the short-term relative motion is estimated and the pose is optimized, which improves the pose estimation accuracy of the visual odometer under strong maneuvers and thereby the navigation positioning accuracy and robustness of the visual odometer.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required in the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in FIG. 1, the invention provides a visual inertial integrated navigation method fusing semantic features, comprising the following steps:
Step 1: collecting RGBD visual sensor data S(k), accelerometer data ã(k) and gyroscope data ω̃(k) at time k.
Step 2: solving for the current camera pose T(k) with a visual odometer from the visual sensor data S(k):
first, ORB features are extracted from two adjacent frames; then the relative pose between the two key frames is computed from the ORB feature matches between the frames using Perspective-n-Point (PnP); the camera pose T(k) at time k is obtained by accumulating the relative poses.
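As an illustration of this step, the following minimal Python/OpenCV sketch (not part of the patent text; estimate_relative_pose is a placeholder name, and the camera intrinsics K and a depth map aligned to the previous frame are assumed to be available) shows one way to obtain the frame-to-frame relative pose from ORB matches and PnP:

```python
import cv2
import numpy as np

def estimate_relative_pose(img_prev, depth_prev, img_curr, K):
    """Estimate the relative pose between two RGBD frames via ORB matching + PnP.

    img_prev/img_curr: grayscale images; depth_prev: depth map aligned to img_prev (metres);
    K: 3x3 camera intrinsic matrix. Returns (R, t) mapping points expressed in the previous
    camera frame into the current camera frame.
    """
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)

    # Hamming-distance brute-force matching with cross-check for binary ORB descriptors
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    obj_pts, img_pts = [], []
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    for m in matches:
        u, v = kp1[m.queryIdx].pt
        z = depth_prev[int(v), int(u)]
        if z <= 0:          # skip pixels with invalid depth
            continue
        # Back-project the previous-frame keypoint to a 3D point in the previous camera frame
        obj_pts.append([(u - cx) * z / fx, (v - cy) * z / fy, z])
        img_pts.append(kp2[m.trainIdx].pt)

    obj_pts = np.asarray(obj_pts, dtype=np.float64)
    img_pts = np.asarray(img_pts, dtype=np.float64)

    # PnP with RANSAC rejects remaining outlier matches
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec
```

Accumulating the (R, t) returned for each frame pair gives the pose T(k) used in the following steps.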
Step 3: extracting and matching semantic plane features between two adjacent image frames from the visual sensor data S(k):
RGBD visual sensor data S(k) and S(k-1) are acquired at times k and k-1, and the corresponding semantic plane features are extracted and matched as follows.
(a) Semantic information extraction
The set semantic objects are detected with the YOLOv3 object detection algorithm, yielding the two-dimensional bounding box bbox of each semantic object in the image frame.
(b) Semantic plane extraction
Within the bbox region, the depth image in the visual sensor data S(k) is converted into a structured point cloud. Following a connected-component-based structured point cloud plane segmentation algorithm, point normal vectors are extracted from neighbourhood covariances, and neighbouring points are judged to belong to the same connected component according to their normal vectors and the distances of their tangent planes to the origin.
Each connected component whose number of points exceeds a set threshold N_min is fitted with a plane by least squares to obtain its plane equation. This yields the parameters of the plane on which the semantic object surface lies (centroid {x, y, z} and normal n_p = {n_x, n_y, n_z}) together with the object category to which the semantic plane belongs, completing the semantic plane extraction.
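A compact way to realise the least-squares plane fit of a connected component is via the eigen-decomposition of the point covariance; the sketch below is an illustration under that choice, not the patent's implementation, and fit_plane and n_min are placeholder names:

```python
import numpy as np

def fit_plane(points, n_min=50):
    """Least-squares plane fit for one connected component of a structured point cloud.

    points: (N, 3) array of 3D points. Returns (centroid, normal, d) with the plane
    written as normal . x + d = 0, or None if the component is too small.
    """
    if points.shape[0] < n_min:          # reject components below the point-count threshold N_min
        return None
    centroid = points.mean(axis=0)
    cov = np.cov((points - centroid).T)  # 3x3 covariance of the demeaned points
    eigvals, eigvecs = np.linalg.eigh(cov)
    normal = eigvecs[:, 0]               # eigenvector of the smallest eigenvalue = plane normal
    normal /= np.linalg.norm(normal)
    d = -float(normal @ centroid)        # plane offset so that normal . x + d = 0
    return centroid, normal, d
```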
(c) Semantic plane feature construction from the extracted plane information
First, the plane type (horizontal or vertical) is determined from the plane normal n_p and the a-priori ground plane normal n_g. The difference between the point cloud normal and the prior ground normal is computed as d_hor = ||n_p − n_g||; if d_hor is smaller than a set horizontal-normal difference threshold t_hor, the plane with normal n_p is labelled a horizontal plane. The deviation of the plane normal from a vertical plane is computed with the dot product d_vert = n_p · n_g; if d_vert is smaller than a set threshold t_vert, the plane with normal n_p is labelled a vertical plane.
Semantic plane features are then constructed from the semantic category, centroid, normal direction and horizontal/vertical type of the plane:
· s_p = {p_x, p_y, p_z}
· s_n = {n_x, n_y, n_z, n_d}
· s_o ∈ {horizontal, vertical}
· s_c = the detected semantic object class corresponding to the plane
where s_p is the plane centroid, s_n is the plane normal parameter, s_o is the plane type label (horizontal/vertical), and s_c is the plane semantic class label, which depends on the semantic object class to which the plane corresponds, as illustrated in the sketch that follows.
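For illustration only, the four attributes above can be grouped into a small record; in the Python sketch below, SemanticPlane and classify_plane_type are placeholder names, the thresholds t_hor and t_vert stand in for the patent's tuned values, and the absolute value in the vertical test is a small robustness addition (it tolerates flipped normals) not stated in the text:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SemanticPlane:
    s_p: np.ndarray   # plane centroid {p_x, p_y, p_z}
    s_n: np.ndarray   # plane normal parameters {n_x, n_y, n_z, n_d}
    s_o: str          # plane type label: "horizontal" or "vertical"
    s_c: str          # semantic class of the detected object the plane belongs to

def classify_plane_type(n_p, n_g, t_hor=0.2, t_vert=0.2):
    """Label a plane as horizontal/vertical from its normal n_p and the prior ground normal n_g."""
    d_hor = np.linalg.norm(n_p - n_g)        # small when n_p is parallel to the ground normal
    if d_hor < t_hor:
        return "horizontal"
    d_vert = abs(float(np.dot(n_p, n_g)))    # small when n_p is perpendicular to the ground normal
    if d_vert < t_vert:
        return "vertical"
    return "other"
```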
(d) Constructing a semantic plane feature map from the semantic plane features, and matching the semantic planes of the current frame against the semantic landmarks in the map.
From the semantic planes obtained in step (b), each plane is written as
S_i = { s^i_p, s^i_n, s^i_o, s^i_c }
where S_i denotes a semantic plane, s^i_p its centroid, s^i_n its normal parameters, s^i_o its plane type label (horizontal/vertical), and s^i_c its plane semantic class label.
The first received semantic plane is mapped directly to the first semantic landmark, denoted L_i, whose centroid p^W_{Li} and normal n^W_{Li} are expressed in the world coordinate system. They are obtained from the plane centroid p^C and normal n^C in the camera frame by the coordinate transformation
p^W_{Li} = R^W_C · p^C + x_r
n^W_{Li} = R^W_C · n^C
where x_r is the current camera position in the world coordinate system and R^W_C is the rotation matrix from the camera coordinate system to the world coordinate system; a minimal numerical sketch of this mapping follows.
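The sketch below is illustrative only; plane_to_world is a placeholder name, and R_wc and x_r are assumed to come from the current visual-odometry pose T(k):

```python
import numpy as np

def plane_to_world(centroid_c, normal_c, R_wc, x_r):
    """Map a semantic plane's centroid and normal from the camera frame to the world frame.

    centroid_c, normal_c: 3-vectors in the camera frame;
    R_wc: 3x3 rotation from camera to world; x_r: camera position in the world frame.
    """
    centroid_w = R_wc @ centroid_c + x_r   # points transform with rotation + translation
    normal_w = R_wc @ normal_c             # directions transform with rotation only
    return centroid_w, normal_w / np.linalg.norm(normal_w)
```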
After the initial semantic landmarks have been set, each detected semantic plane S_k of every frame is associated with the semantic landmarks in three steps:
First, the semantic planes extracted from the frame are matched against the semantic landmarks by semantic category and plane type. For each successful match, the point count and area of the semantic plane are computed, and semantic planes whose point count is below a minimum threshold t_p or whose plane area is below a minimum threshold t_a are discarded.
Second, the normal of the semantic plane is transformed from the camera coordinate system to the world coordinate system, giving n^W, and compared with the landmark normal n^W_L. If the deviation between n^W and n^W_L is below a set threshold t_n, the semantic plane is considered consistent with the landmark normal and the subsequent coordinate association is carried out; otherwise the match fails.
Third, the centroid of the semantic plane is transformed from the camera coordinate system to the world coordinate system, and the Mahalanobis distance to the centroid of the matched landmark is computed. If this distance exceeds a set threshold, the detected semantic object is mapped to a new semantic landmark l_j; otherwise the semantic object is associated with the current semantic landmark l_i.
Through these steps, the observation relation between the current key frame and the semantic landmarks is obtained, as sketched below.
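The following Python sketch condenses the three association tests (category/type gating, normal consistency, Mahalanobis gating on the centroid). It is illustrative only: associate_plane and the per-landmark covariance 'cov' are assumptions, and the thresholds t_p, t_a, t_n, t_m stand in for the patent's tuned values.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance between a point x and a landmark centroid with covariance cov."""
    d = x - mean
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

def associate_plane(plane, landmarks, R_wc, x_r,
                    t_p=200, t_a=0.05, t_n=0.1, t_m=3.0):
    """Return ('reject', None), ('match', idx) or ('new', None) for a detected semantic plane.

    plane: dict with 'points', 'area', 'centroid', 'normal', 's_o', 's_c' in the camera frame;
    landmarks: list of dicts with 'centroid', 'normal', 'cov', 's_o', 's_c' in the world frame.
    """
    # Gate 1: discard sparse or small planes
    if len(plane['points']) < t_p or plane['area'] < t_a:
        return ('reject', None)
    n_w = R_wc @ plane['normal']                     # plane normal in the world frame
    c_w = R_wc @ plane['centroid'] + x_r             # plane centroid in the world frame
    for idx, lm in enumerate(landmarks):
        # Gate 2: same semantic class and same horizontal/vertical type
        if lm['s_c'] != plane['s_c'] or lm['s_o'] != plane['s_o']:
            continue
        # Gate 3: normal consistency in the world frame
        if np.linalg.norm(n_w - lm['normal']) > t_n:
            continue
        # Gate 4: Mahalanobis distance between centroids
        if mahalanobis(c_w, lm['centroid'], lm['cov']) <= t_m:
            return ('match', idx)                    # observation of an existing landmark
    return ('new', None)                             # caller creates a new landmark l_j
```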
Step 4: pre-integration between two adjacent image frames from the inertial sensor data:
The inertial sensor data obtained at time k comprise the accelerometer data ã(i) and gyroscope data ω̃(i) from time k-1 to time k, with i = 0, 1, 2, …, (t(k) − t(k−1))/Δt, where t(k) is the sampling instant corresponding to time k, t(k−1) the sampling instant corresponding to time k−1, and Δt the sampling period of the inertial sensor. The inertial sensor measurement model is:
ã = a + b_a + R^b_W · g^W + n_a
ω̃ = ω + b_ω + n_ω
where n_a and n_ω are the accelerometer and gyroscope white noise respectively; b_a and b_ω are the accelerometer and gyroscope random walks, whose derivatives are white noise; a is the ideal accelerometer measurement and ω the ideal gyroscope measurement; g^W is the gravity in the navigation frame; and R^b_W is the rotation matrix from the navigation coordinate system to the body coordinate system at the sampling instant.
The pre-integration between two adjacent inertial samples is:
α_{i+1} = α_i + β_i·Δt + ½·R(γ_i)·(ã_i − b_a)·Δt²
β_{i+1} = β_i + R(γ_i)·(ã_i − b_a)·Δt
γ_{i+1} = γ_i ⊗ [1, ½·(ω̃_i − b_ω)·Δt]
where α is the position pre-integration, β the velocity pre-integration, the rotation is represented by the quaternion γ, and γ is the rotation pre-integration; at the start, α and β are 0 and γ is the unit quaternion; R(γ) denotes the conversion of a quaternion into a rotation matrix and ⊗ denotes quaternion multiplication. Several inertial samples lie between two adjacent image frames, and all of them are pre-integrated iteratively with the formulas above, which brings the visual data frequency into agreement with the inertial pre-integration frequency and yields the pre-integration terms α, β and γ between the two adjacent image frames, as in the sketch below.
and 5: optimally solving carrier navigation information by combining a semantic plane observation residual error, a visual odometer relative pose observation residual error and an inertia pre-integration residual error:
a) establishing an optimized variable X:
Figure BDA0003043820950000114
wherein ,
Figure BDA0003043820950000115
n is the sequence number of the last frame,
Figure BDA0003043820950000116
Figure BDA0003043820950000117
and
Figure BDA0003043820950000118
respectively representing the carrier position and speed of the k-th frameRandom walk of degrees, quaternions, accelerometers, and gyroscopes;
Figure BDA0003043820950000119
the coordinates of the semantic plane features are used, and m is the serial number of the last semantic plane feature;
(b) establishing an optimization function e:
Figure BDA00030438209500001110
wherein the Semantic part represents the residual of the Semantic roadmap observation,
Figure BDA00030438209500001111
representing the semantic landmark position error observed by a certain frame of the camera. The VO part represents the observed residual error of the relative pose in the visual odometer. The third term is the IMU pre-integration estimate residual. P (-) represents a robust kernel function, Σ-1An information matrix representing the error amount is an inverse of the covariance and represents the pre-estimation of each constraint accuracy. Wherein i and j refer to the ith and jth image frames respectively, and k refers to the kth semantic landmark.
For the semantic landmark observation error: the error e compares the position observation P̃^{Ci}_{Lk} of the semantic landmark in the camera coordinate system with the prediction obtained by mapping the landmark position into the i-th camera frame, where P̃^{Ci}_{Lk} is the position observation of the semantic landmark in the camera coordinate system, and the camera pose T^W_{Ci} at the i-th frame and the position P^W_{Lk} of landmark L_k in the world frame are the variables to be optimized. The Jacobian matrices of e with respect to T^W_{Ci} and P^W_{Lk} are derived analytically in terms of ξ, the Lie-algebra representation of the pose T^W_{Ci}; [X_i′ Y_i′ Z_i′]ᵀ, the coordinates of the semantic plane in the camera frame at time i; the focal lengths f_x and f_y from the camera intrinsic parameters; and R^{Ci}_W, the rotation component of T^{Ci}_W.
For the VO pose observation residual: the error e compares the relative-pose observation T̃^{Ci}_{Cj} of the visual odometer between the i-th and j-th frames with the relative pose predicted from the poses T^W_{Ci} and T^W_{Cj}, which are the variables to be optimized. Using the Lie-algebra right perturbation model and its adjoint property, the Jacobian matrices of e with respect to the poses T^W_{Ci} and T^W_{Cj} are obtained.
for the inertial pre-integration residual, it is obtained from the difference between the predicted value and the pre-integration value between two adjacent image frames:
Figure BDA0003043820950000131
in the above formula, the first and second carbon atoms are,
Figure BDA0003043820950000132
a rotation matrix from a j moment navigation coordinate system to a body coordinate system;
Figure BDA0003043820950000133
Figure BDA0003043820950000134
respectively being the positions of the organism coordinate system at the time i and the time j under the navigation coordinate system;
Figure BDA0003043820950000135
the speed of the organism coordinate system at the time i and the speed of the organism coordinate system at the time j are respectively under the navigation coordinate system; Δ tiThe time interval between two adjacent image frames;
Figure BDA0003043820950000136
quaternion of the rotation of the body coordinate system under the navigation coordinate system at the time i and the time j respectively;
Figure BDA0003043820950000137
Respectively moving the accelerometers in the machine body coordinate system at the time i and the time j randomly;
Figure BDA0003043820950000138
the gyroscope respectively moves randomly at i moment and j moment under the coordinate system of the body, [ gamma ]]xyzRepresenting the x, y, z components of a quaternion gamma;
(c) and (4) carrying out iterative solution on the optimization function by using a Levenberg-Marquardt algorithm, and stopping iteration when error convergence or a set maximum iteration number is reached to obtain carrier navigation information.
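This sketch is purely illustrative: residual_and_jacobian is a placeholder for the concatenated semantic/VO/IMU residuals and their Jacobians, the state is treated as a flat vector (in practice the pose updates live on a manifold), and the damping schedule and thresholds are assumptions:

```python
import numpy as np

def levenberg_marquardt(x0, residual_and_jacobian, max_iter=50, tol=1e-6, lam=1e-3):
    """Minimise 0.5*||r(x)||^2 by Levenberg-Marquardt over the stacked residual r.

    residual_and_jacobian(x) must return (r, J) with r an (M,) vector and J an (M, N) matrix.
    """
    x = x0.copy()
    r, J = residual_and_jacobian(x)
    cost = 0.5 * float(r @ r)
    for _ in range(max_iter):
        H = J.T @ J                                   # Gauss-Newton approximation of the Hessian
        g = J.T @ r
        dx = np.linalg.solve(H + lam * np.diag(np.diag(H)), -g)
        x_new = x + dx
        r_new, J_new = residual_and_jacobian(x_new)
        cost_new = 0.5 * float(r_new @ r_new)
        if cost_new < cost:                           # accept the step, relax the damping
            x, r, J, cost = x_new, r_new, J_new, cost_new
            lam *= 0.5
            if np.linalg.norm(dx) < tol:              # stop when the update has converged
                break
        else:                                         # reject the step, increase the damping
            lam *= 5.0
    return x
```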
Step 6: outputting the carrier navigation information and the camera intrinsic parameters, and returning to Step 1.
When the camera maneuvers strongly, motion estimation that relies on visual features alone yields low navigation and positioning accuracy. The invention fuses semantic features and inertial information into the optimization objective function of the visual odometer, adding high-dimensional feature observations and additional sensor observation constraints to the visual odometer; the short-term relative motion is estimated and the pose is optimized, which improves the pose estimation accuracy of the visual odometer under strong maneuvers and thereby the navigation positioning accuracy and robustness of the visual odometer.
The above-described embodiments merely illustrate preferred embodiments of the present invention and do not limit its scope. Various modifications and improvements made to the technical solutions of the invention by those skilled in the art without departing from the spirit of the invention shall fall within the protection scope defined by the claims.

Claims (9)

1. A visual inertial integrated navigation method fusing semantic features, characterized by comprising the following steps:
step 1, collecting RGBD visual sensor data S(k), accelerometer data ã(k) and gyroscope data ω̃(k) at time k;
step 2, solving for the current camera pose T(k) with a visual odometer from the visual sensor data S(k);
step 3, constructing a semantic plane feature map from the visual sensor data between two adjacent image frames, matching the semantic planes of the current frame with the semantic landmarks in the map, and obtaining the observation relation between the current key frame and the semantic landmarks;
step 4, performing inertial pre-integration between two adjacent image frames from the inertial sensor data, wherein the inertial sensor data comprise the accelerometer data and the gyroscope data;
step 5, taking the sum of the semantic plane observation residual, the visual-odometer relative-pose observation residual and the inertial pre-integration residual as a joint optimization function, and performing nonlinear optimization on this function to solve for the pose; and
step 6, outputting the carrier navigation information and the camera intrinsic parameters, and returning to step 1.
2. The visual inertial integrated navigation method fusing semantic features according to claim 1, wherein acquiring the current pose of the camera specifically comprises: first extracting ORB features from two adjacent frames, then computing the relative pose between the two key frames from the ORB feature matches between the frames using PnP, and obtaining the camera pose T(k) at time k by accumulating the relative poses.
3. The visual inertial integrated navigation method fusing semantic features according to claim 2, wherein plane information is acquired from the visual sensor data between two adjacent image frames, and a semantic plane feature is constructed from the semantic category, centroid, normal direction and horizontal/vertical type of the plane:
· s_p = {p_x, p_y, p_z}
· s_n = {n_x, n_y, n_z, n_d}
· s_o ∈ {horizontal, vertical}
· s_c = the detected semantic object class corresponding to the plane
wherein s_p is the plane centroid, s_n is the plane normal parameter, s_o is the plane type label (horizontal/vertical), and s_c is the plane semantic class label, which depends on the semantic object class corresponding to the plane;
a semantic plane feature map is constructed from the constructed semantic plane features; and
initial semantic landmarks are defined, and each detected semantic plane S_k of every frame is matched against the initial semantic landmarks by category, normal direction and centroid, yielding the observation relation between the current key frame and the semantic landmarks.
4. The visual inertial integrated navigation method fusing semantic features according to claim 1, wherein the inertial sensor data obtained at time k comprise the accelerometer data ã(i) and gyroscope data ω̃(i) from time k-1 to time k, which are used to construct the inertial sensor measurement model:
ã = a + b_a + R^b_W · g^W + n_a
ω̃ = ω + b_ω + n_ω
wherein n_a and n_ω are the accelerometer and gyroscope white noise respectively; b_a and b_ω are the accelerometer and gyroscope random walks, whose derivatives are white noise; a is the ideal accelerometer measurement and ω the ideal gyroscope measurement; g^W is the gravity in the navigation frame; R^b_W is the rotation matrix from the navigation coordinate system to the body coordinate system at the sampling instant;
a plurality of inertial sensor samples lie between two adjacent image frames, and all inertial sensor data between the two adjacent image frames are pre-integrated iteratively:
α_{i+1} = α_i + β_i·Δt + ½·R(γ_i)·(ã_i − b_a)·Δt²
β_{i+1} = β_i + R(γ_i)·(ã_i − b_a)·Δt
γ_{i+1} = γ_i ⊗ [1, ½·(ω̃_i − b_ω)·Δt]
wherein α is the position pre-integration, β the velocity pre-integration, the rotation is represented by the quaternion γ, and γ is the rotation pre-integration; at the start, α and β are 0 and γ is the unit quaternion; R(γ) denotes the conversion of a quaternion into a rotation matrix and ⊗ denotes quaternion multiplication; and
the frequency of the visual data is made consistent with the inertial pre-integration frequency, yielding the inertial pre-integration terms α, β and γ between two adjacent image frames.
5. The visual inertial integrated navigation method fusing semantic features according to claim 1, wherein the optimization function is:
e(X) = Σ_{(i,k)} ρ( ||e^S_{ik}||²_{Σ_S} ) + Σ_{(i,j)} ρ( ||e^V_{ij}||²_{Σ_V} ) + Σ_{(i,j)} ||e^B_{ij}||²_{Σ_B},   with ||e||²_Σ = eᵀ Σ⁻¹ e
wherein the Semantic part (first sum) represents the semantic landmark observation residual, e^S_{ik} being the observation error of a semantic landmark observed in a given camera frame; the VO part (second sum) represents the observation residual of the relative pose in the visual odometer; the third term is the IMU pre-integration estimation residual; ρ(·) is a robust kernel function; Σ⁻¹ is the information matrix of each error term, i.e. the inverse of its covariance, representing the a-priori confidence in each constraint; and i and j denote the i-th and j-th image frames and k the k-th semantic landmark.
6. The visual inertial integrated navigation method fusing semantic features according to claim 5, wherein
the semantic landmark observation error e compares the position observation P̃^{Ci}_{Lk} of the semantic landmark in the camera coordinate system with the estimate obtained by transforming the landmark position into the camera frame through the camera pose, wherein P̃^{Ci}_{Lk} is the position observation of the semantic landmark in the camera coordinate system, and the camera pose T^W_{Ci} at the i-th frame and the position P^W_{Lk} of landmark L_k in the world frame are the variables to be optimized; the Jacobian matrices of the error e with respect to T^W_{Ci} and P^W_{Lk} are computed;
the pose observation residual e compares the relative-pose observation T̃^{Ci}_{Cj} of the visual odometer between the i-th and j-th frames with the relative pose predicted from the poses T^W_{Ci} and T^W_{Cj} to be optimized; the Jacobian matrices of this error with respect to the poses are computed; and
the inertial pre-integration residual is computed as the difference between the predicted values and the pre-integration values between two adjacent image frames.
7. The visual inertial integrated navigation method fusing semantic features according to claim 6, wherein the Jacobian matrices of e in the semantic landmark observation error with respect to T^W_{Ci} and P^W_{Lk} are derived analytically, wherein ξ is the Lie-algebra representation of the pose T^W_{Ci}, [X_i′ Y_i′ Z_i′]ᵀ are the coordinates of the semantic plane in the camera frame at time i, f_x and f_y are the focal lengths from the camera intrinsic parameters, and R^{Ci}_W is the rotation component of T^{Ci}_W.
8. The visual inertial integrated navigation method fusing semantic features according to claim 6, wherein the Jacobian matrices of e in the pose observation residual with respect to the poses T^W_{Ci} and T^W_{Cj} are obtained using the Lie-algebra right perturbation model and its adjoint property.
9. The visual inertial integrated navigation method fusing semantic features according to claim 6, wherein the inertial pre-integration residual stacks the position, velocity, rotation, accelerometer-random-walk and gyroscope-random-walk errors between the predicted values and the pre-integrated values, wherein R^{bj}_W is the rotation matrix from the navigation coordinate system to the body coordinate system at time j; p^W_{bi} and p^W_{bj} are the positions of the body coordinate system in the navigation coordinate system at times i and j respectively; v^W_{bi} and v^W_{bj} are the velocities of the body coordinate system in the navigation coordinate system at times i and j respectively; Δt_i is the time interval between two adjacent image frames; q^W_{bi} and q^W_{bj} are the quaternions of the rotation of the body coordinate system in the navigation coordinate system at times i and j respectively; b_{a,i} and b_{a,j} are the accelerometer random walks in the body coordinate system at times i and j respectively; b_{ω,i} and b_{ω,j} are the gyroscope random walks in the body coordinate system at times i and j respectively; and [γ]_{xyz} denotes taking the x, y, z components of the quaternion γ.
CN202110467584.3A 2021-04-28 2021-04-28 Visual inertial integrated navigation method integrating semantic features Active CN113188557B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110467584.3A CN113188557B (en) 2021-04-28 2021-04-28 Visual inertial integrated navigation method integrating semantic features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110467584.3A CN113188557B (en) 2021-04-28 2021-04-28 Visual inertial integrated navigation method integrating semantic features

Publications (2)

Publication Number Publication Date
CN113188557A true CN113188557A (en) 2021-07-30
CN113188557B CN113188557B (en) 2023-10-20

Family

ID=76979977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110467584.3A Active CN113188557B (en) 2021-04-28 2021-04-28 Visual inertial integrated navigation method integrating semantic features

Country Status (1)

Country Link
CN (1) CN113188557B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113465598A (en) * 2021-08-04 2021-10-01 北京云恒科技研究院有限公司 Inertia combination navigation system suitable for unmanned aerial vehicle
CN114152937A (en) * 2022-02-09 2022-03-08 西南科技大学 External parameter calibration method for rotary laser radar
CN115493612A (en) * 2022-10-12 2022-12-20 中国第一汽车股份有限公司 Vehicle positioning method and device based on visual SLAM

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10366508B1 (en) * 2016-08-29 2019-07-30 Perceptin Shenzhen Limited Visual-inertial positional awareness for autonomous and non-autonomous device
CN109945858A (en) * 2019-03-20 2019-06-28 浙江零跑科技有限公司 It parks the multi-sensor fusion localization method of Driving Scene for low speed
CN110794828A (en) * 2019-10-08 2020-02-14 福瑞泰克智能***有限公司 Road sign positioning method fusing semantic information
CN111156997A (en) * 2020-03-02 2020-05-15 南京航空航天大学 Vision/inertia combined navigation method based on camera internal parameter online calibration
CN111325842A (en) * 2020-03-04 2020-06-23 Oppo广东移动通信有限公司 Map construction method, repositioning method and device, storage medium and electronic equipment
CN111693047A (en) * 2020-05-08 2020-09-22 中国航空工业集团公司西安航空计算技术研究所 Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene
CN112348921A (en) * 2020-11-05 2021-02-09 上海汽车集团股份有限公司 Mapping method and system based on visual semantic point cloud
CN112484725A (en) * 2020-11-23 2021-03-12 吉林大学 Intelligent automobile high-precision positioning and space-time situation safety method based on multi-sensor fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIE JIN: "Localization Based on Semantic Map and Visual Inertial Odometry", 2018 24th International Conference on Pattern Recognition (ICPR)
ZHOU WUGEN: "Research on Dense Visual Simultaneous Localization and Mapping in Dynamic Environments", China Doctoral Dissertations Full-text Database (Information Science and Technology)
XIE SHICHAO: "Research on Point Cloud Map Construction and Updating for Autonomous Driving Based on Multi-Sensor Fusion", China Master's Theses Full-text Database (Engineering Science and Technology II)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113465598A (en) * 2021-08-04 2021-10-01 北京云恒科技研究院有限公司 Inertia combination navigation system suitable for unmanned aerial vehicle
CN113465598B (en) * 2021-08-04 2024-02-09 北京云恒科技研究院有限公司 Inertial integrated navigation system suitable for unmanned aerial vehicle
CN114152937A (en) * 2022-02-09 2022-03-08 西南科技大学 External parameter calibration method for rotary laser radar
CN115493612A (en) * 2022-10-12 2022-12-20 中国第一汽车股份有限公司 Vehicle positioning method and device based on visual SLAM
WO2024077935A1 (en) * 2022-10-12 2024-04-18 中国第一汽车股份有限公司 Visual-slam-based vehicle positioning method and apparatus

Also Published As

Publication number Publication date
CN113188557B (en) 2023-10-20

Similar Documents

Publication Publication Date Title
CN109307508B (en) Panoramic inertial navigation SLAM method based on multiple key frames
CN111595333B (en) Modularized unmanned vehicle positioning method and system based on visual inertia laser data fusion
CN112734852B (en) Robot mapping method and device and computing equipment
CN113781582B (en) Synchronous positioning and map creation method based on laser radar and inertial navigation combined calibration
CN113188557A (en) Visual inertial integrated navigation method fusing semantic features
WO2020038285A1 (en) Lane line positioning method and device, storage medium and electronic device
CN110044354A (en) A kind of binocular vision indoor positioning and build drawing method and device
Panahandeh et al. Vision-aided inertial navigation based on ground plane feature detection
CN110095116A (en) A kind of localization method of vision positioning and inertial navigation combination based on LIFT
CN107941217B (en) Robot positioning method, electronic equipment, storage medium and device
CN112634451A (en) Outdoor large-scene three-dimensional mapping method integrating multiple sensors
CN107909614B (en) Positioning method of inspection robot in GPS failure environment
CN105953796A (en) Stable motion tracking method and stable motion tracking device based on integration of simple camera and IMU (inertial measurement unit) of smart cellphone
CN111337943B (en) Mobile robot positioning method based on visual guidance laser repositioning
CN111156997B (en) Vision/inertia combined navigation method based on camera internal parameter online calibration
CN114526745A (en) Drawing establishing method and system for tightly-coupled laser radar and inertial odometer
CN111060099B (en) Real-time positioning method for unmanned automobile
CN113220818B (en) Automatic mapping and high-precision positioning method for parking lot
CN114529576A (en) RGBD and IMU hybrid tracking registration method based on sliding window optimization
CN110675455A (en) Self-calibration method and system for car body all-around camera based on natural scene
CN116518984B (en) Vehicle road co-location system and method for underground coal mine auxiliary transportation robot
CN115371665A (en) Mobile robot positioning method based on depth camera and inertia fusion
CN112179373A (en) Measuring method of visual odometer and visual odometer
CN115574816A (en) Bionic vision multi-source information intelligent perception unmanned platform
CN112731503A (en) Pose estimation method and system based on front-end tight coupling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant