CN111595332B - Full-environment positioning method integrating inertial technology and visual modeling - Google Patents

Full-environment positioning method integrating inertial technology and visual modeling

Info

Publication number
CN111595332B
CN111595332B
Authority
CN
China
Prior art keywords
positioning
image
inertial
coordinate system
measurement unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010284018.4A
Other languages
Chinese (zh)
Other versions
CN111595332A (en)
Inventor
许凯
鲍翊平
王栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Shenxun Technology Co ltd
Original Assignee
Ningbo Shenxun Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Shenxun Information Technology Co ltd filed Critical Ningbo Shenxun Information Technology Co ltd
Priority to CN202010284018.4A priority Critical patent/CN111595332B/en
Publication of CN111595332A publication Critical patent/CN111595332A/en
Application granted granted Critical
Publication of CN111595332B publication Critical patent/CN111595332B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/18 Stabilised platforms, e.g. by gyroscope

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of positioning methods, and in particular to a full-environment positioning method integrating inertial technology and visual modeling. The method comprises the following steps: S1, fixing the positions of an inertial measurement unit and an image acquisition device so that they remain relatively stationary; S2, performing positioning with the data acquired by the inertial measurement unit to obtain positioning coordinates, while performing visual modeling with the data acquired by the image acquisition device to obtain a visual model; S3, unifying the coordinate systems of the visual model and the positioning coordinates, performing clock synchronization, and finally displaying the positioning coordinates in the visual model. The positioning method can be used in unfamiliar environments without depending on external equipment, and offers low cost and high positioning accuracy.

Description

Full-environment positioning method integrating inertial technology and visual modeling
Technical Field
The invention relates to the technical field of positioning methods, and in particular to a full-environment positioning method integrating inertial technology and visual modeling.
Background
Because inertial navigation positioning suffers from drift accumulation, its positioning accuracy degrades as time and distance increase and cannot by itself meet everyday positioning needs. Other positioning technologies therefore have to be used together with inertial navigation, typically satellite positioning or UWB positioning.
However, the biggest defect of satellite positioning is its sensitivity to signal conditions: even outdoors, once the receiver is surrounded by high-rise buildings it cannot deliver accurate positioning even in combination with inertial navigation. The disadvantage of UWB positioning is that a large number of base stations must be deployed, which is impractical in an unfamiliar environment and costly.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide a full-environment positioning method integrating inertial technology and visual modeling that can be used in unfamiliar environments without depending on external equipment, at low cost and with high positioning accuracy.
The technical scheme adopted by the invention is as follows: a full-environment positioning method integrating inertial technology and visual modeling comprises the following steps:
S1, fixing the positions of an inertial measurement unit and an image acquisition device so that they remain relatively stationary;
S2, performing positioning with the data acquired by the inertial measurement unit to obtain positioning coordinates, while performing visual modeling with the data acquired by the image acquisition device to obtain a visual model;
S3, unifying the coordinate systems of the visual model and the positioning coordinates, then performing clock synchronization, and finally displaying the positioning coordinates in the visual model.
Preferably, the clock synchronization in step S3 is achieved by continuously converging the time deviation to zero through successive optimization.
Preferably, the positioning by the inertial measurement unit in step S2 specifically includes the following steps:
SA1, measuring horizontal acceleration and gravity acceleration from an inertial navigation measurement unit;
SA2, obtaining a moving direction and a moving speed through horizontal acceleration and gravity acceleration, and obtaining a positioning position according to an initial position;
SA3, calculating errors and correcting.
Preferably, the errors in step SA3 include a velocity error, a position error, and an attitude error.
Preferably, the visual modeling of the data acquired by the image acquisition device in the step S2 specifically includes the following steps:
SB1, preprocessing an image acquired by image acquisition equipment, and converting the image into characteristic points with main directions;
SB2, performing closed loop detection;
SB3, determining depth data of each characteristic point to obtain three-dimensional coordinates of each characteristic point;
SB4, obtaining a relative pose through feature point matching;
SB5, 2D-3D conversion is carried out, and three-dimensional visual modeling is realized.
Preferably, step SB1 comprises the following specific steps:
SB11, converting the image acquired by the image acquisition device to grayscale;
SB12, applying Gaussian filtering to the grayscale image;
SB13, computing second-order partial derivatives (a Hessian matrix) at each pixel of the Gaussian-filtered image so as to find edge points in the image;
SB14, screening edge points to obtain final characteristic points;
SB15, giving the main direction of the final feature point.
Compared with the prior art, the method provided by the invention has the following advantages: by combining inertial navigation positioning with three-dimensional visual modeling, positioning is achieved without depending on external auxiliary equipment; an actual map of an unfamiliar environment can be generated in real time, providing on-site environment data to a commander for decision making; and the inertial positioning result is corrected with the visually modeled map, so that inertial positioning remains accurate and drift-free over long periods.
Drawings
FIG. 1 is a schematic diagram of integrated pixel values in a full environment localization method combining inertial techniques and visual modeling according to the present invention.
FIG. 2 is a schematic diagram of a Gaussian filter template and box filter in a full environment positioning method combining inertial technology and visual modeling according to the present invention.
Fig. 3 is a schematic diagram of three layers of images in a full environment positioning method integrating inertial technology and visual modeling.
FIG. 4 is a schematic diagram of a Haar wavelet filter in a full environment positioning method integrating inertial technology and visual modeling according to the present invention.
FIG. 5 is an explanatory diagram of epipolar constraints in a full environment positioning method integrating inertial technology with visual modeling in accordance with the present invention.
Fig. 6 is a schematic diagram of clock synchronization in a full environment positioning method integrating inertial technology and visual modeling according to the present invention.
Detailed Description
The present invention is further described below by way of the following embodiments, but the present invention is not limited to the following embodiments.
The invention realizes final positioning calculation mainly from two angles of inertial technology and visual modeling.
The positions of the inertial measurement unit (which may also be an inertial navigation module, comprising mainly three accelerometers) and the image acquisition device (comprising mainly a camera) need to be fixed so that they remain relatively stationary.
First, the specific force components measured by the three accelerometers inside the inertial measurement unit in each direction of the wearer are collected as the vector f^b = [f_x^b, f_y^b, f_z^b]^T for the subsequent positioning calculation.
In order to split the specific force vectorially into horizontal acceleration and gravitational acceleration, the specific force expressed in the carrier coordinate system (b-system) has to be converted into the navigation coordinate system (n-system), so that the true ground acceleration values can be obtained for effective positioning.
The conversion between the coordinate systems is realized by the direction cosine matrix C_b^n, namely f^n = C_b^n · f^b. To obtain C_b^n, the angles between the two coordinate systems must be known:
Heading angle ψ: the angle between the projection of the carrier's longitudinal axis y_b on the local tangent plane of the earth and the geographic north axis y_t, taken positive clockwise from north towards east, with a range of 0° to 360°.
Pitch angle θ: the angle between the carrier's longitudinal axis and its projection on the horizontal plane, positive upwards, with a range of -90° to 90°.
Roll angle γ: the angle between the carrier's transverse axis x_b and its projection on the horizontal plane, positive when tilted to the right, with a range of -180° to 180°.
The conversion relationship between the carrier coordinate system (b-system) and the navigation coordinate system (n-system) is f^n = C_b^n · f^b, where the attitude (direction cosine) matrix C_b^n is the 3×3 rotation matrix built from the heading angle ψ, pitch angle θ and roll angle γ defined above; to simplify the subsequent formulas, this attitude matrix is denoted T.
Once the conversion matrix C_b^n has been obtained, f^b can be converted into f^n for the subsequent position calculation. However, using f^n directly for the next calculation would be unreasonable: the carrier moves in the earth's environment and is subject to gravity and the earth's rotation, so these effects must be taken into account in the navigation coordinate system; otherwise the positioning result would contain a huge error.
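As an illustrative sketch of this conversion, the following Python snippet builds a direction cosine matrix from heading, pitch and roll and rotates the measured specific force into the navigation frame; the ENU/right-front-up axis convention and the exact element layout are assumptions, since the patent's explicit matrix is given only as an equation image.

    import numpy as np

    def dcm_body_to_nav(heading, pitch, roll):
        # Direction cosine matrix C_b^n from heading (psi), pitch (theta), roll (gamma), in radians.
        # Layout assumes an ENU navigation frame and a right-front-up body frame.
        sp, cp = np.sin(heading), np.cos(heading)
        st, ct = np.sin(pitch), np.cos(pitch)
        sr, cr = np.sin(roll), np.cos(roll)
        return np.array([
            [cr * cp + sr * sp * st,  sp * ct,  sr * cp - cr * sp * st],
            [-cr * sp + sr * cp * st, cp * ct, -sr * sp - cr * cp * st],
            [-sr * ct,                st,       cr * ct],
        ])

    # Specific force measured by the three accelerometers in the carrier frame (m/s^2).
    f_b = np.array([0.1, 0.2, 9.81])
    C_bn = dcm_body_to_nav(np.radians(30.0), np.radians(2.0), np.radians(-1.0))
    f_n = C_bn @ f_b   # specific force resolved in the navigation frame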
Taking gravity and the earth's rotation into account, and using the relation between the absolute and relative rates of change of a vector together with Newton's second law, the basic equation of the inertial navigation system is obtained as:

dv^n/dt = f^n - (2ω_ie^n + ω_en^n) × v^n + g^n

wherein:
dv^n/dt: the acceleration vector in the navigation coordinate system;
f^n: the specific force vector in the navigation coordinate system (output of the accelerometers, converted from f^b);
ω_ie^n: the angular velocity of the earth's rotation;
ω_en^n: the angular velocity of the carrier's motion relative to the earth;
g^n: the gravity acceleration vector;
(2ω_ie^n + ω_en^n) × v^n: the sum of the Coriolis acceleration and the centripetal acceleration due to the earth.
The Coriolis acceleration is defined as follows: when a point moves in a straight line relative to a moving reference frame and that frame is itself rotating, the relative velocity of the point changes and a Coriolis acceleration appears. In short, the Coriolis acceleration arises from the combined action of the relative motion and the frame's rotation.
This equation yields the velocity of the carrier, where

ω_ie^n = [0, ω_ie·cosL, ω_ie·sinL]^T
ω_en^n = [-v_N/(R_m + h), v_E/(R_n + h), v_E·tanL/(R_n + h)]^T

with L the latitude, h the height, and v_E, v_N the east and north velocity components. R_m and R_n are the principal radii of curvature of the earth reference ellipsoid, i.e. the meridian radius R_m and the prime-vertical radius R_n, both functions of the latitude L.
Substituting the expressions for f^n, ω_ie^n, ω_en^n and g^n into the velocity equation and integrating gives the three axial velocities v_E, v_N and v_U.
Once the three axial velocities have been obtained, the moving direction and speed of the person can be calculated. Substituting the velocity into the integral equations and adding the person's initial position then gives the final position, which is updated in real time as the velocity changes:

L = L_0 + ∫_0^T v_N/(R_m + h) dt
λ = λ_0 + ∫_0^T v_E/((R_n + h)·cosL) dt

wherein L_0 and λ_0 are respectively the latitude and longitude of the carrier at time zero, and T represents the elapsed time.
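The following sketch shows one possible discrete implementation of the velocity and position update described above; the Euler integration step, the ENU axis ordering and the variable names are illustrative assumptions rather than the patent's implementation.

    import numpy as np

    OMEGA_IE = 7.292115e-5   # earth rotation rate (rad/s)

    def ins_update(v_n, lat, lon, h, f_n, R_m, R_n, g, dt):
        # angular rate of the n-frame caused by earth rotation, resolved in ENU
        w_ie = OMEGA_IE * np.array([0.0, np.cos(lat), np.sin(lat)])
        # transport rate caused by motion over the curved earth
        v_e, v_north, v_u = v_n
        w_en = np.array([-v_north / (R_m + h),
                         v_e / (R_n + h),
                         v_e * np.tan(lat) / (R_n + h)])
        g_n = np.array([0.0, 0.0, -g])
        # basic INS equation: dv/dt = f_n - (2*w_ie + w_en) x v + g
        v_dot = f_n - np.cross(2.0 * w_ie + w_en, v_n) + g_n
        v_new = v_n + v_dot * dt
        # latitude/longitude/height rates integrated with a simple Euler step
        lat_new = lat + v_north / (R_m + h) * dt
        lon_new = lon + v_e / ((R_n + h) * np.cos(lat)) * dt
        h_new = h + v_u * dt
        return v_new, lat_new, lon_new, h_new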
Inevitably, every system has errors; to further improve the positioning accuracy of inertial navigation, a final error calculation is performed to optimize the positioning result.
Assume that the true navigation coordinate system n and the navigation coordinate system n' computed by the computer differ by the small misalignment angles φ = [φ_x, φ_y, φ_z]^T. The coordinate transformation matrix from the n-system to the n'-system is then, to first order,

C_n^n' ≈ I - (φ×)

where (φ×) denotes the antisymmetric (cross-product) matrix of φ. In the following, this transformation matrix is used to derive the errors of velocity, position and attitude respectively.
Calculating the velocity error:
In the ideal case, the velocity equation of the horizontal channels is the one given above. In the actual case, the computation uses the actual velocity value, the actual specific force value and the actual latitude value L_c, so a corresponding set of expressions for the actual velocity is readily obtained, with δv_i (i = x, y, z) denoting the velocity error and ∇ the accelerometer zero bias.
Subtracting the ideal equations from the actual ones and combining them yields the expression for the velocity error δv_i in terms of the misalignment angles φ, the accelerometer bias ∇ and the position error.
calculating a position error:
from the position calculation formula of the carrier
Figure BDA0002447802830000067
The longitude and latitude change rate under the actual error condition can be obtained as follows:
Figure BDA0002447802830000068
there are also actual error expressions:
Figure BDA0002447802830000069
/>
the expression of the error obtained by combining the above equations is:
Figure BDA00024478028300000610
calculating attitude errors:
known to have angular deviations of
Figure BDA00024478028300000611
The form written as a negative antisymmetric matrix is:
Figure BDA0002447802830000071
the previous conversion matrix may be changed to:
Figure BDA0002447802830000072
navigation coordinate system calculated by knowing relative computer of inertial measurement unit coordinate system s system
Figure BDA00024478028300000714
The system conversion matrix is:
Figure BDA0002447802830000073
and deriving the left and right sides of the equation to obtain:
Figure BDA0002447802830000074
then the properties contained in the cosine matrix
Figure BDA0002447802830000075
Substituting the above equation to obtain
Figure BDA0002447802830000076
Both sides multiply by the direction cosine matrix
Figure BDA0002447802830000077
And substitute into +.>
Figure BDA0002447802830000078
In the matrix, obtain:
Figure BDA0002447802830000079
wherein, the liquid crystal display device comprises a liquid crystal display device,
Figure BDA00024478028300000710
the drift errors of three axes of the gyroscope in static state are expressed, and the drift errors of x, y and z are respectively epsilon x 、ε y 、ε z I.e. +.>
Figure BDA00024478028300000711
Deriving from the angular velocity calculation matrix obtained in the velocity calculation, the corresponding drift can be obtained:
Figure BDA00024478028300000712
/>
Figure BDA00024478028300000713
by combining the above equations, the attitude error expression can be:
Figure BDA0002447802830000081
by the positioning calculation and error elimination algorithm, the influence of angle errors on the positioning accuracy can be reduced to a great extent. However, inertial navigation has a further characteristic, namely-error accumulation. Although the corrected inertial navigation has an accuracy of 3% relative error, its accuracy decreases with time.
The invention therefore corrects the inertial navigation positioning result by means of visual modeling, relying on the model to constrain the drift of the positioning points and thus maintain high-precision positioning over long periods. That is: the inertial measurement unit acquires the specific force components, while a lens group worn by the person acquires the scene in real time for modeling.
First, a live view is captured by the lens group worn by the person and converted to grayscale, so that each pixel of the image is represented by a gray value. A coordinate system is established with the upper-left corner of the image as the origin, the x-axis positive to the right and the y-axis positive downwards. For any pixel (x_i, y_i), the sum of all pixel values inside the rectangle enclosed by that pixel and the origin is its integral pixel value.
With the integral pixel values, the pixel sum of an arbitrary rectangle is obtained simply by adding and subtracting the integral values at its four corners, as shown in Fig. 1: Σ = A - B - C + D, where A is the upper-left point, B the upper-right point, C the lower-left point and D the lower-right point. Because the sum over a whole rectangle no longer requires accumulating every pixel inside it, the amount of image-processing computation is greatly reduced.
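A minimal sketch of the integral-image computation and the four-corner rectangle sum described above; NumPy is used for brevity and the function names are illustrative.

    import numpy as np

    def integral_image(gray):
        # cumulative sums along both axes give the integral pixel values
        return np.cumsum(np.cumsum(gray.astype(np.int64), axis=0), axis=1)

    def rect_sum(ii, top, left, bottom, right):
        # Sum of pixels in gray[top:bottom+1, left:right+1] using four lookups (A - B - C + D).
        D = ii[bottom, right]
        B = ii[top - 1, right] if top > 0 else 0
        C = ii[bottom, left - 1] if left > 0 else 0
        A = ii[top - 1, left - 1] if (top > 0 and left > 0) else 0
        return D - B - C + A

    gray = np.arange(25).reshape(5, 5)
    ii = integral_image(gray)
    assert rect_sum(ii, 1, 1, 3, 3) == gray[1:4, 1:4].sum()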
After the pixel values have been obtained, Gaussian filtering is applied to the image to ensure that the feature points finally extracted are scale-invariant (that is, machine vision can determine whether two images show the same scene regardless of its apparent size; this matters for the closed-loop detection performed later in the modeling process).
Gaussian filtering is a weighted averaging of the image: the value of each pixel is replaced by a weighted average of its own value and the values of the other pixels in its neighborhood, with closer points receiving higher weights and farther points lower weights. The image is thereby blurred, simulating how the machine images an object as the distance to it changes from far to near.
The two-dimensional Gaussian filter is given by:

G(x, y) = 1/(2πσ^2) · exp(-(x^2 + y^2)/(2σ^2))

wherein x and y are the coordinates of the pixel and σ^2 is the variance.
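For illustration, the kernel defined by this formula can be generated as follows; the 5×5 template size and σ = 1.0 are arbitrary example values.

    import numpy as np

    def gaussian_kernel(size, sigma):
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
        return g / g.sum()   # normalise so the weighted average preserves brightness

    kernel = gaussian_kernel(5, 1.0)   # 5x5 template, sigma = 1.0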
After the image has been Gaussian-blurred, the Hessian matrix is used to take second-order partial derivatives at every pixel, so as to find the edge (abrupt) points in the image.
The discriminant of the Hessian matrix is its determinant,

det(H) = L_xx·L_yy - (L_xy)^2

where L_xx, L_yy and L_xy are the second-order partial derivatives of the Gaussian-smoothed image.
Since the Gaussian kernel follows a normal distribution, the SURF algorithm approximates the Gaussian filter with a box filter in order to increase the computation speed, as shown in Fig. 2. The approximation of the Hessian determinant then becomes:

det(H) = D_xx·D_yy - (0.9·D_xy)^2

wherein the factor 0.9 multiplying D_xy balances the error introduced by using the box filter.
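The following hedged sketch computes a Hessian response map; it uses Gaussian derivative filters from SciPy in place of the integral-image box filters that SURF actually uses, but keeps the 0.9 balancing factor quoted in the text.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def hessian_response(gray, sigma):
        img = gray.astype(np.float64)
        Lxx = gaussian_filter(img, sigma, order=(0, 2))   # second derivative in x
        Lyy = gaussian_filter(img, sigma, order=(2, 0))   # second derivative in y
        Lxy = gaussian_filter(img, sigma, order=(1, 1))   # mixed derivative
        # det(H) approximation with the balancing factor from the text
        return Lxx * Lyy - (0.9 * Lxy) ** 2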
After all the abrupt points in the image have been found by the above steps, they are screened one by one to obtain the final feature points. A feature point is an extreme point in the image, and the extremum is taken not only over the 8 surrounding points in the same layer but also over the 9 points in each of the two adjacent layers.
As shown in Fig. 3, the detection point in the middle is compared with the 8 pixels in its 3×3 neighborhood in its own layer and with the 18 pixels in the 3×3 neighborhoods of the adjacent upper and lower layers, 26 pixels in total. Therefore, for every abrupt point to have an upper and a lower layer to compare against, a single image is far from enough, and a scale space has to be constructed.
The scale space is constructed as follows: the size of the original image is kept unchanged and only the size of the box-filter template is varied, producing images of different scales that together form a pyramid model.
The first filter is a 9×9 box filter, which corresponds to a second-order Gaussian filter with σ = 1.2. The scale space is made up of groups (octaves), and each group in turn contains several layers. The layers within a group are generated by filters of different sizes, while the groups differ in their starting filter size and size increment. The correspondence between the size of the filter template and σ is:
σ_approx = current template size × 1.2 / 9

For example, a 27×27 template corresponds to σ = 27 × 1.2 / 9 = 3.6.
Note that, to keep the scale space continuous and to avoid the first and last layers of each group having nothing to compare against, adjacent groups must partially overlap in scale, while the box filters within each group grow progressively larger.
The abrupt points obtained above can now be screened in the constructed scale space to find the extreme points among them, i.e. the feature points. Depending on the actual situation, the feature points may need to be further refined.
If few feature points are obtained, they are used directly without further refinement.
If there are many feature points, a further screening step is needed and only the feature points with larger contrast are kept. The specific method is as follows:
A threshold is set; a candidate feature point is denoted x, its offset is denoted Δx, and D(x) is expanded in a Taylor series, the contrast being the absolute value |D(x)|:

D(x + Δx) = D(x) + (∂D/∂x)^T·Δx + (1/2)·Δx^T·(∂^2 D/∂x^2)·Δx

Since x is an extreme point of D(x), differentiating the expansion and setting the derivative to zero gives

Δx = -(∂^2 D/∂x^2)^(-1)·(∂D/∂x)

Substituting the obtained Δx back into the Taylor expansion of D(x) gives

D(x + Δx) = D(x) + (1/2)·(∂D/∂x)^T·Δx

Let the contrast threshold be T. If |D(x + Δx)| ≥ T, the feature point is kept; otherwise it is removed. This ensures that the remaining feature points are high-contrast points.
In practice the threshold is chosen according to the required number and accuracy of feature points. The larger the threshold, the more robust the obtained feature points, but the fewer of them there are. For images of simple scenes the threshold can be lowered appropriately; for complex images, where the number of feature points changes considerably after rotation or blurring, the threshold needs to be raised appropriately during testing.
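As a sketch of the contrast test for a single candidate point: the gradient and Hessian of D are assumed to be precomputed finite-difference estimates around the candidate, and the default threshold value is only a placeholder.

    import numpy as np

    def passes_contrast_test(D0, grad, hess, T=0.03):
        # D0: D at the candidate, grad: dD/dx (3-vector), hess: d2D/dx2 (3x3 matrix)
        offset = -np.linalg.solve(hess, grad)       # delta_x = -(d2D/dx2)^-1 * dD/dx
        contrast = D0 + 0.5 * grad @ offset         # D(x + delta_x) from the Taylor expansion
        return abs(contrast) >= T, offset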
After the final feature points have been determined, each is assigned a main direction so that the feature points are rotation-invariant, that is, two images taken at different rotation angles can still be matched through the main direction of each feature point. This step is also relevant to the closed-loop detection performed later in the modeling process.
The method mainly comprises the following steps:
I. Determining the direction of the interest points
Each interest point is first assigned a direction characteristic. Within the circular neighborhood centered on the feature point with radius 6S (S being the scale of the interest point), the image is processed with a Haar wavelet template of size 4S to obtain the Haar wavelet responses in the x and y directions.
The Haar wavelet templates are shown in Fig. 4: the left template gives the response in the x direction and the right template the response in the y direction, with black representing -1 and white representing +1. After the circular neighborhood has been processed, the x and y responses of every point in the neighborhood are obtained and then weighted with a Gaussian function (σ = 2S) centered on the interest point.
A 60° sector is placed in the circular neighborhood and the sum of the Haar wavelet responses inside it is computed; the sector is then rotated and the sum recomputed. The direction in which the sum of the wavelet responses is largest is taken as the main direction.
II. Generating the feature point descriptor
A square frame is selected around the feature point, oriented along the main direction of the key point, with side length 20S. It is divided into 16 sub-regions (side length 5S), and in each sub-region the Haar wavelet responses of 25 pixels are computed in the horizontal and vertical directions (both defined relative to the main direction of the square frame).
The wavelet features of each sub-region are the sum of the horizontal responses, the sum of their absolute values, the sum of the vertical responses and the sum of their absolute values (the absolute values are accumulated so that polarity information about the intensity changes is also contained in the descriptor).
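A rough sketch of assembling the 64-dimensional descriptor from precomputed Haar responses follows; the sampling of the responses is simplified away, and the final normalisation step is a common practice added here for robustness rather than something stated in the text.

    import numpy as np

    def descriptor_from_responses(dx, dy):
        # dx, dy: 20x20 arrays of Haar responses, already Gaussian-weighted and
        # rotated into the main direction of the feature point
        desc = []
        for by in range(4):
            for bx in range(4):
                sx = dx[5 * by:5 * by + 5, 5 * bx:5 * bx + 5]
                sy = dy[5 * by:5 * by + 5, 5 * bx:5 * bx + 5]
                desc += [sx.sum(), np.abs(sx).sum(), sy.sum(), np.abs(sy).sum()]
        v = np.array(desc)                      # 16 sub-regions x 4 sums = 64 values
        return v / (np.linalg.norm(v) + 1e-12)  # normalisation (added for robustness)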
All of the above is preprocessing: the images acquired by the lens group are converted into feature points with main directions, which are then put to use as follows.
The first use is closed-loop detection. Since the whole algorithm only considers the correlation between adjacent moments, error accumulation is unavoidable, and the trajectory drift of the whole VSLAM becomes more and more serious as time goes on. The trajectory therefore has to be corrected with closed-loop detection, pulling the drifted points back to their original positions. After the feature points have been obtained by the above algorithm, the KD-Tree algorithm is used to match nearest-neighbor feature points and to recognize whether two images show the same scene, thereby judging whether a closed loop has been formed.
The KD-Tree algorithm is based on repeatedly splitting the data into two parts by value: all values on the left branch are smaller than the root value and all values on the right branch are larger, and the splitting continues until no subset can be divided further.
For two-dimensional data such as pixel positions, the variances of x and y are computed separately and the split is made along the direction with the larger variance, because a larger variance indicates that the data are more spread out in that direction and the division is more distinct.
Example: take the 6 pixel positions {(4, 7), (2, 3), (7, 2), (8, 1), (5, 4), (9, 6)}.
The variances of x and y are computed separately; the variance in the x direction is larger, so the first split is made in the x direction over the values 2, 4, 5, 7, 8, 9. Taking the middle value 7 gives a left branch (2, 4, 5) and a right branch (8, 9).
The next split is made in the y direction: the median 4 is taken from 3, 4, 7 on the left, and 6 from 1, 6 on the right, giving the binary tree.
In this way, when feature points are matched, the shortest-path search can be carried out directly on the binary tree according to the numerical values, realizing highly efficient image matching.
However, a single downward search does not necessarily return the point closest to the query point Q; a backtracking step is also required, i.e. checking whether an unvisited branch of the tree could contain a point closer to Q, in other words whether the distance from Q to the splitting plane of the unvisited branch is smaller than the distance from Q to the current nearest neighbor. Geometrically, this amounts to testing whether the hypersphere centered on Q with radius equal to the current nearest distance intersects the hyper-rectangle represented by the unvisited branch. If they intersect, the distances to the points in that branch are computed as well; otherwise the branch is skipped.
An intersection indicates that the backtracked branch may contain a point closer to Q; otherwise the current result is already the nearest point. The descriptors of each query point Q and its nearest neighbor are then compared. Since one image contains tens of such query points, tens of correspondences are formed, and when the values in these correspondences are similar to the feature descriptions of the adjacent frames of the other image, it can be considered that a closed loop has been formed.
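A minimal matching sketch using SciPy's KD-tree in place of a hand-written one; the ratio test at the end is an optional common addition, not part of the method described above.

    import numpy as np
    from scipy.spatial import cKDTree

    def match_descriptors(desc_a, desc_b, ratio=0.8):
        # desc_a, desc_b: (N, 64) and (M, 64) descriptor arrays from two images
        tree = cKDTree(desc_b)
        dists, idx = tree.query(desc_a, k=2)          # two nearest neighbours per query
        good = dists[:, 0] < ratio * dists[:, 1]      # reject ambiguous matches
        return [(i, idx[i, 0]) for i in np.flatnonzero(good)]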
After the trajectory-drift problem has been addressed, the depth data (the distance relative to the lens group) of each feature point is determined.
Here the principle of epipolar constraint is used. As shown in Fig. 5, a and c are the camera centers and x is an image point; the object X must lie on the extension of the line cx. If image data from another lens is available, a second such line can be drawn, and the intersection of the two lines is the true position of the object, which determines the distance from the object to the machine. On this basis, the first camera can be taken as the coordinate origin and assumed to have undergone no translation or rotation; the rotation and translation of the second camera relative to the first are then determined from the motion of the image points, giving the pose of the camera.
The derivation is as follows:
Let P = [X, Y, Z]^T be a point in three-dimensional space expressed in the coordinate system of the first camera, and let p_1 and p_2 be its image points in the two cameras. Since the center of the first camera serves as the origin of the world coordinate system, i.e. the first camera has no rotation or translation, the pinhole camera model gives:

p_1 = K·P,   p_2 = K·(R·P + t)

where K is the intrinsic matrix of the camera and R, t are the rotation and translation of the second camera relative to the first.
From p_1 = K·P one obtains P = K^(-1)·p_1, and substituting this into the second equation gives:

p_2 = K·(R·K^(-1)·p_1 + t)

Multiplying both sides on the left by K^(-1) gives

K^(-1)·p_2 = R·K^(-1)·p_1 + t

Let x_1 = K^(-1)·p_1 and x_2 = K^(-1)·p_2, so that

x_2 = R·x_1 + t

Multiplying both sides on the left by the antisymmetric matrix t× of the vector t eliminates t, since t×t = 0:

t×x_2 = t×R·x_1

Multiplying both sides on the left by x_2^T gives

x_2^T·t×x_2 = x_2^T·t×R·x_1

Since t×x_2 is perpendicular to both t and x_2, the left-hand side is zero, and therefore

x_2^T·t×R·x_1 = 0

Substituting x_1 and x_2 back gives

p_2^T·K^(-T)·t×R·K^(-1)·p_1 = 0

This equation is another expression of the epipolar constraint; it contains only the image points and the rotation and translation of the camera, and the matrix in the middle is the fundamental matrix F:

p_2^T·F·p_1 = 0,   where F = K^(-T)·t×R·K^(-1)

If the intrinsic parameters K of the camera are known, the intermediate matrix

E = t×R

can be extracted. E is called the essential matrix, and it differs from the fundamental matrix only by the camera intrinsics K. The pose of the camera is then calculated from the matched point pairs.
From the above, the matched pixels p_1, p_2 and the fundamental matrix F satisfy p_2^T·F·p_1 = 0 with F = K^(-T)·t×R·K^(-1). In other words, the fundamental matrix F of the two views can be computed from matched point pairs alone (at least 7 pairs), and the pose of the camera is then decomposed from F.
The relative pose of the camera can thus be estimated by feature point matching:
1. extract the feature points of the two images and match them;
2. compute the fundamental matrix F of the two views from the matched image points;
3. decompose the fundamental matrix F to obtain the rotation matrix R and translation vector t of the camera.
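For illustration, steps 2 and 3 can be carried out with OpenCV's estimators as sketched below; since the intrinsics K are known, the essential matrix is estimated directly. The function names are OpenCV's, not the patent's, and the RANSAC parameters are placeholders.

    import numpy as np
    import cv2

    def relative_pose(pts1, pts2, K):
        # pts1, pts2: (N, 2) arrays of matched pixel coordinates, N >= 8 recommended
        E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                          method=cv2.RANSAC, prob=0.999, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)   # decompose E = t x R into R and t
        return R, t, inliers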
Once the correspondence between spatial points and pixels in the image has been acquired by the above method, the final 2D-3D coordinate conversion can be carried out, realizing the three-dimensional modeling of the VSLAM.
That is, a spatial point [x, y, z] and its pixel coordinates [u, v, d] in the image (d being the depth data) are related by:

u = x·f_x/z + c_x,   v = y·f_y/z + c_y,   d = z·s

wherein f_x and f_y are the focal lengths of the camera along the x and y axes, c_x and c_y give the aperture (principal point) center of the camera, and s is the scaling factor of the depth map. This formula goes from (x, y, z) to (u, v, d); conversely, it can be rewritten with (u, v, d) known to derive (x, y, z):

z = d/s,   x = (u - c_x)·z/f_x,   y = (v - c_y)·z/f_y

A point cloud model can be constructed from this formula.
In general, the four parameters f_x, f_y, c_x, c_y are collected into the intrinsic matrix C of the camera, i.e. parameters that do not change once the camera is manufactured. Given the intrinsic matrix, the spatial position and pixel coordinates of each point can be described by a simple matrix model:

s·[u, v, 1]^T = C·(R·[x, y, z]^T + t)

where R and t are the pose of the camera: R is the rotation matrix and t the displacement vector. Because a single point cloud is being built here, the camera is considered not to rotate or translate, so R is set to the identity matrix I and t to zero. s is the ratio between the value stored in the depth map and the actual distance; since depth maps are usually stored as short integers in millimeters, s is typically 1000.
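A sketch of the (u, v, d) to (x, y, z) back-projection applied to a whole depth image; fx, fy, cx, cy and the depth scale s stand for the real calibration values of the camera.

    import numpy as np

    def depth_to_point_cloud(depth, fx, fy, cx, cy, s=1000.0):
        # depth: (H, W) array in millimetres, so s = 1000 converts to metres
        v, u = np.indices(depth.shape)
        z = depth.astype(np.float64) / s
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
        return pts[pts[:, 2] > 0]    # keep only pixels with a valid depth reading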
The present invention combines these two techniques through two steps: unifying the coordinate system and performing clock synchronization.
Coordinate system
In general, the two technologies have independent coordinate systems. For the inertial positioning points to be displayed with high precision inside the model generated by visual modeling, visual modeling and inertial positioning must share the same coordinate system; only then do the motion trajectories of a fixed point in space coincide in the coordinate systems of the two techniques.
The invention integrates the visual modeling and inertial positioning modules into a single product, so that they remain relatively stationary. This means that, once the product is manufactured, the angles between the camera coordinate system used by visual modeling and the carrier coordinate system used by inertial positioning are fixed. The z-axis of the camera coordinate system is the optical axis of the camera, its x- and y-axes are parallel to the x- and y-axes of the imaging plane, and its origin lies at the optical center of the camera; the carrier coordinate system is the coordinate system formed by the three accelerometers in the inertial positioning module.
With the two coordinate systems known, substituting them into the conversion formulas between the carrier coordinate system and the navigation coordinate system yields the corresponding transformation matrix T, built from the following angles:
θ_x: the angle between the projection of the y-axis (longitudinal axis) of the camera coordinate system on the xy-plane of the carrier coordinate system and the y-axis of the carrier coordinate system, taken positive clockwise, with a range of 0° to 360°.
θ_y: the angle between the y-axis of the camera coordinate system and its projection on the xy-plane of the carrier coordinate system, positive upwards, with a range of -90° to 90°.
θ_z: the angle between the x-axis of the camera coordinate system and its projection on the xy-plane of the carrier coordinate system, positive when tilted to the right, with a range of -180° to 180°.
The relation A = T·B then holds: substituting camera-coordinate-system points into B, the resulting new coordinates coincide with the inertial (carrier) coordinate system, which ensures that the positioning trajectories of the two technologies overlap and that the positioning trajectory is displayed accurately in the map model built in real time.
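An illustrative sketch of the unification A = T·B; the decomposition of T into three elementary rotations and their composition order are assumptions, since the explicit matrix is given only as an equation image in the original.

    import numpy as np

    def rot_x(a): return np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])
    def rot_y(a): return np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
    def rot_z(a): return np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])

    def mounting_matrix(theta_x, theta_y, theta_z):
        # theta_x acts as the heading-like angle (about z), theta_y as the pitch-like
        # angle (about x), theta_z as the roll-like angle (about y); all in radians
        return rot_z(theta_x) @ rot_x(theta_y) @ rot_y(theta_z)

    T = mounting_matrix(np.radians(5.0), np.radians(-2.0), np.radians(0.5))
    B = np.array([0.3, 0.1, 2.0])   # a point in the camera coordinate system
    A = T @ B                       # the same point in the carrier coordinate system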
Clock synchronization
After the two coordinate systems have been made to coincide by the coordinate conversion, the positioning points produced by the two techniques should in theory coincide as well. In practice, however, the results still differ even when drift is ignored: trigger delay, transmission delay and imperfect clock synchronization mean that the devices do not produce their positioning results at exactly the same instant, so the results deviate from each other. As shown in Fig. 6, the offset between the timestamps of the camera data and the IMU data is treated as part of the state variables and denoted t_d, with t_IMU = t_cam + t_d, so that the timestamps of the inertial measurement unit and the image acquisition device (camera) can be brought into coincidence.
Let the timestamp of the n-th frame be t_n; the corresponding true sampling instant is then t_n - t_d. For a feature point I_k on this frame with coordinates [x_k, y_k]^T, the relation between the observed and the true feature-point coordinates is:

I_k^true = [x_k, y_k]^T - t_d·V_k

where V_k is the velocity of the feature point on the image plane during the interval t_d. Since t_d is normally very small, the feature point can be regarded as moving uniformly, i.e. V_k is a constant, which can be approximated from the feature-point positions in consecutive frames.
Substituting this into the three-dimensional coordinate system, i.e. replacing I_k by I_k^true, and introducing the time-offset parameter t_d into the camera residual term of the optimization equation, the re-projection error can be written as:

e_k = I_k^true - π(R_k·(P_l - p_k))

where P_l = [x_l, y_l, z_l]^T is the three-dimensional coordinate of the feature point and (R_k, p_k) is the pose of the camera.
Since the error is not a fixed value, the minimum of the total error is sought (the error should not be large under normal conditions). The optimization therefore minimizes, over the state vector χ, the sum of a prior term, the inertial residuals and the visual residuals:

min_χ { ||e_p - H_p·χ||^2 + Σ_(k∈B) ||e_B(I_(k+1), χ)||^2 + Σ_((l,k)∈C) ||e_C(l, k, χ)||^2 }

wherein e_p - H_p·χ is the prior factor, e_B(I_(k+1), χ) is the error term produced by the inertial system, B is the set of individual measurements taken by the inertial device, e_C(l, k, χ) is the error term generated by visual modeling, and C is the set of feature points acquired by visual modeling and observed at least twice; this last term contains the information about t_d.
Finally, the t_d obtained from the minimization is substituted back into the initial time offset, t'_cam = t_cam - t_d, a new set of timestamps is established with the newly obtained t'_cam, and the above operations are repeated. In this way the deviation converges continuously toward zero in the successive optimizations, and clock synchronization is finally achieved.
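A very rough sketch of the iterative synchronization loop described above; estimate_td stands in for the full joint optimization and is a hypothetical callback, not a function defined by the patent.

    import numpy as np

    def feature_velocity(I_k, I_k1, t_k, t_k1):
        # approximate pixel velocity between consecutive frames (uniform motion assumed)
        return (I_k1 - I_k) / (t_k1 - t_k)

    def synchronise(t_cam, estimate_td, tol=1e-4, max_iters=10):
        for _ in range(max_iters):
            t_d = estimate_td(t_cam)      # time offset recovered by the joint optimisation
            t_cam = t_cam - t_d           # t'_cam = t_cam - t_d
            if abs(t_d) < tol:            # deviation has converged to (near) zero
                break
        return t_cam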
When the two technologies satisfy both coordinate-system unification and clock synchronization, accurate positioning of the same point can be achieved. The real-time model obtained by the visual modeling technique is simultaneously transmitted back to the background command terminal, providing the commander with a realistic simulation of the on-site environment. The method thus achieves long-duration, high-precision positioning and simulated map feedback of an unfamiliar environment without depending on any external base station.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; while the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will appreciate that modifications may be made to the techniques described in the foregoing embodiments, or that equivalents may be substituted for elements thereof; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. A full-environment positioning method integrating inertial technology and visual modeling, characterized by comprising the following steps:
S1, fixing the positions of an inertial measurement unit and an image acquisition device so that they remain relatively stationary;
S2, performing positioning with the data acquired by the inertial measurement unit to obtain positioning coordinates, while performing visual modeling with the data acquired by the image acquisition device to obtain a visual model;
S3, unifying the coordinate systems of the visual model and the positioning coordinates, then performing clock synchronization, and finally displaying the positioning coordinates in the visual model;
the positioning by the inertial measurement unit in step S2 specifically includes the following steps:
SA1, measuring horizontal acceleration and gravity acceleration from an inertial navigation measurement unit;
SA2, obtaining a moving direction and a moving speed through horizontal acceleration and gravity acceleration, and obtaining a positioning position according to an initial position;
SA3, correcting calculation errors;
the step S2 of performing visual modeling on the data acquired by the image acquisition equipment specifically comprises the following steps:
SB1, preprocessing an image acquired by image acquisition equipment, and converting the image into characteristic points with main directions;
SB2, performing closed loop detection;
SB3, determining depth data of each characteristic point to obtain three-dimensional coordinates of each characteristic point;
SB4, obtaining a relative pose through feature point matching;
SB5, 2D-3D conversion is carried out to realize three-dimensional visual modeling;
step SB1 comprises the following specific steps:
SB11, converting the image acquired by the image acquisition device to grayscale;
SB12, applying Gaussian filtering to the grayscale image;
SB13, computing second-order partial derivatives (a Hessian matrix) at each pixel of the Gaussian-filtered image so as to find edge points in the image;
SB14, screening edge points to obtain final characteristic points;
SB15, giving the main direction of the final feature point.
2. The full environment positioning method integrating inertial technology and visual modeling according to claim 1, wherein the method comprises the following steps: the clock synchronization in step S3 is achieved by continuously converging the deviation to zero in a successive optimization.
3. The full environment positioning method integrating inertial technology and visual modeling according to claim 1, wherein: the unification of the coordinate systems in step S3 means that a transformation matrix is obtained from the known coordinate system of the inertial measurement unit and the coordinate system of the image acquisition device, and the coordinates are then unified through this transformation matrix.
4. The full environment positioning method integrating inertial technology and visual modeling according to claim 1, wherein the method comprises the following steps: the errors in step SA3 include a velocity error, a position error, and an attitude error.
CN202010284018.4A 2020-04-13 2020-04-13 Full-environment positioning method integrating inertial technology and visual modeling Active CN111595332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010284018.4A CN111595332B (en) 2020-04-13 2020-04-13 Full-environment positioning method integrating inertial technology and visual modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010284018.4A CN111595332B (en) 2020-04-13 2020-04-13 Full-environment positioning method integrating inertial technology and visual modeling

Publications (2)

Publication Number Publication Date
CN111595332A CN111595332A (en) 2020-08-28
CN111595332B true CN111595332B (en) 2023-05-09

Family

Family ID: 72184968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010284018.4A Active CN111595332B (en) 2020-04-13 2020-04-13 Full-environment positioning method integrating inertial technology and visual modeling

Country Status (1)

Country Link
CN (1) CN111595332B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112085794B (en) * 2020-09-11 2022-05-17 中德(珠海)人工智能研究院有限公司 Space positioning method and three-dimensional reconstruction method applying same
CN112747746A (en) * 2020-12-25 2021-05-04 珠海市一微半导体有限公司 Point cloud data acquisition method based on single-point TOF, chip and mobile robot
CN112862818B (en) * 2021-03-17 2022-11-08 合肥工业大学 Underground parking lot vehicle positioning method combining inertial sensor and multi-fisheye camera

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019062291A1 (en) * 2017-09-29 2019-04-04 歌尔股份有限公司 Binocular vision positioning method, device, and system
CN109752003A (en) * 2018-12-26 2019-05-14 浙江大学 A kind of robot vision inertia dotted line characteristic positioning method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10062010B2 (en) * 2015-06-26 2018-08-28 Intel Corporation System for building a map and subsequent localization
CN107255476B (en) * 2017-07-06 2020-04-21 青岛海通胜行智能科技有限公司 Indoor positioning method and device based on inertial data and visual features
CN208751577U (en) * 2018-09-20 2019-04-16 江阴市雷奥机器人技术有限公司 A kind of robot indoor locating system
CN109141433A (en) * 2018-09-20 2019-01-04 江阴市雷奥机器人技术有限公司 A kind of robot indoor locating system and localization method
CN109520476B (en) * 2018-10-24 2021-02-19 天津大学 System and method for measuring dynamic pose of rear intersection based on inertial measurement unit
CN110044354B (en) * 2019-03-28 2022-05-20 东南大学 Binocular vision indoor positioning and mapping method and device
CN110095116A (en) * 2019-04-29 2019-08-06 桂林电子科技大学 A kind of localization method of vision positioning and inertial navigation combination based on LIFT

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019062291A1 (en) * 2017-09-29 2019-04-04 歌尔股份有限公司 Binocular vision positioning method, device, and system
CN109752003A (en) * 2018-12-26 2019-05-14 浙江大学 A kind of robot vision inertia dotted line characteristic positioning method and device

Also Published As

Publication number Publication date
CN111595332A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN111595332B (en) Full-environment positioning method integrating inertial technology and visual modeling
CN106679648B (en) Visual inertia combination SLAM method based on genetic algorithm
CN109506642B (en) Robot multi-camera visual inertia real-time positioning method and device
CN104704384B (en) Specifically for the image processing method of the positioning of the view-based access control model of device
CN102607526B (en) Target posture measuring method based on binocular vision under double mediums
Li et al. Real-time motion tracking on a cellphone using inertial sensing and a rolling-shutter camera
CN106525074B (en) A kind of compensation method, device, holder and the unmanned plane of holder drift
Li et al. Real-time 3D motion tracking and reconstruction system using camera and IMU sensors
CN108810473B (en) Method and system for realizing GPS mapping camera picture coordinate on mobile platform
CN107909614B (en) Positioning method of inspection robot in GPS failure environment
CN109596121B (en) Automatic target detection and space positioning method for mobile station
CN109598794A (en) The construction method of three-dimension GIS dynamic model
CN110388919B (en) Three-dimensional model positioning method based on feature map and inertial measurement in augmented reality
CN107767425A (en) A kind of mobile terminal AR methods based on monocular vio
Li et al. Large-scale, real-time 3D scene reconstruction using visual and IMU sensors
CN105352509A (en) Unmanned aerial vehicle motion target tracking and positioning method under geographic information space-time constraint
CN112833892B (en) Semantic mapping method based on track alignment
CN109871739B (en) Automatic target detection and space positioning method for mobile station based on YOLO-SIOCTL
US11935249B2 (en) System and method for egomotion estimation
CN108958462A (en) A kind of methods of exhibiting and device of virtual objects
CN113516692A (en) Multi-sensor fusion SLAM method and device
CN110675453A (en) Self-positioning method for moving target in known scene
CN114708293A (en) Robot motion estimation method based on deep learning point-line feature and IMU tight coupling
CN114964276A (en) Dynamic vision SLAM method fusing inertial navigation
CN116772844A (en) Navigation method of visual inertial indoor robot based on dynamic environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: No. 106-6, Dongliufang, Yinzhou District, Ningbo City, Zhejiang Province, 315000

Patentee after: Zhejiang shenxun Technology Co.,Ltd.

Country or region after: China

Address before: 315194 room 209-2, building B, science and technology information Incubation Park, No. 655, bachelor Road, Yinzhou District, Ningbo City, Zhejiang Province

Patentee before: Ningbo shenxun Information Technology Co.,Ltd.

Country or region before: China