WO2022228391A1 - Terminal device positioning method and related devices - Google Patents

Terminal device positioning method and related devices

Info

Publication number
WO2022228391A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal device
pose
image frame
map
current image
Prior art date
Application number
PCT/CN2022/089007
Other languages
English (en)
French (fr)
Inventor
薛常亮 (Xue Changliang)
李和平 (Li Heping)
温丰 (Wen Feng)
张洪波 (Zhang Hongbo)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to EP22794861.9A (published as EP4322020A1)
Publication of WO2022228391A1
Priority to US18/494,547 (published as US20240062415A1)

Classifications

    • G06F16/245: Information retrieval of structured data; query processing
    • G06F16/29: Information retrieval of structured data; geographical information databases
    • G06T7/73: Image analysis; determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74: Image analysis; determining position or orientation using feature-based methods involving reference images or patches
    • G06V10/443: Image or video recognition; local feature extraction by matching or filtering
    • G06T2207/30244: Indexing scheme for image analysis; camera pose

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a terminal device positioning method and related devices.
  • In the related art, the current image frame captured by the terminal device can be obtained, and map points matching the feature points that present objects in the traffic environment in the current image frame can be obtained from a preset vector map (in such a map, objects in the traffic environment can be represented by map points; for example, a light pole is represented by a straight line formed by map points, a sign is represented by a rectangular frame formed by map points, and so on). Finally, the positioning result of the terminal device in the vector map is determined according to the matching result between the feature points and the map points.
  • Embodiments of the present application provide a terminal device positioning method and related devices, which can improve the accuracy of a terminal device positioning result.
  • a first aspect of the embodiments of the present application provides a method for locating a terminal device, and the method includes:
  • During movement, the terminal device can photograph the traffic environment through its camera at the current moment to obtain the current image frame. Further, the terminal device may also acquire other image frames before the current image frame. Then, the terminal device can position itself according to the current image frame and the other image frames.
  • the terminal device first obtains, from the vector map, the first map point that matches the first feature point of the current image frame.
  • For example, the feature points used to represent traffic lights in the current image frame and the map points used to represent traffic lights in the vector map are matching points, the feature points used to represent lane lines in the current image frame and the map points used to represent lane lines in the vector map are matching points, and so on.
  • the terminal device may also obtain, from the vector map, second map points that match the second feature points of other image frames before the current image frame.
  • Then, the terminal device can construct an objective function according to the first matching error between the first feature point and the first map point and the second matching error between the second feature point and the second map point, and adjust the pose of the terminal device when shooting the current image frame according to the objective function, that is, optimize that pose according to the objective function until the objective function converges, so as to obtain the currently adjusted (optimized) pose of the terminal device when shooting the current image frame, which is used as the positioning result of the terminal device in the vector map.
  • the pose of the terminal device when the current image frame is captured generally refers to the pose of the terminal device in the three-dimensional coordinate system corresponding to the vector map when the current image frame is captured.
  • From the foregoing method, it can be seen that the first map point matching the first feature point of the current image frame, and the second map point matching the second feature points of the other image frames before the current image frame, can be obtained from the vector map, and the pose of the terminal device when shooting the current image frame is adjusted through the objective function. This considers not only the influence of the current image frame on the optimization of the pose of the terminal device when shooting the current image frame, but also the influence of the other image frames on that optimization, that is, the correlation between the current image frame and the other image frames. Since the factors considered are more comprehensive, the positioning result of the terminal device obtained in this way has higher accuracy.
  • In one possible implementation, the method further includes: acquiring the pose of the terminal device when shooting the current image frame and the last adjusted poses of the terminal device when shooting the other image frames, and performing semantic detection on the current image frame and on the other image frames before the current image frame, so as to obtain the first feature point of the current image frame and the second feature points of the other image frames. Then, the first map point matching the first feature point can be obtained from the vector map according to the pose of the terminal device when shooting the current image frame, and the second map point matching the second feature point can be obtained from the vector map according to the last adjusted poses of the terminal device when shooting the other image frames. In this way, the association matching between feature points and map points can be completed.
  • In one possible implementation, adjusting the pose of the terminal device when shooting the current image frame according to the objective function, to obtain the currently adjusted pose of the terminal device when shooting the current image frame, includes: after the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system are obtained, calculating the distance between the two positions to obtain the initial value of the first matching error, and likewise calculating the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system to obtain the initial value of the second matching error. Based on these initial values, the objective function is iteratively solved until the preset iteration condition is met, and the currently adjusted pose of the terminal device when shooting the current image frame is obtained. In this way, the pose is adjusted according to both the current image frame and the other image frames, the factors considered are more comprehensive, and the positioning result of the terminal device can be obtained accurately.
  • In one possible implementation, the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system includes at least one of the following: (1) the distance between the position of the first feature point in the current image frame and the position of the first map point in the current image frame, where the position of the first map point in the current image frame is obtained according to the position of the first map point in the three-dimensional coordinate system corresponding to the vector map and the pose of the terminal device when shooting the current image frame.
  • In one possible implementation, the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system includes at least one of the following: (1) the distance between the position of the second feature point in the other image frames and the position of the second map point in the other image frames, where the position of the second map point in the other image frames is obtained according to the position of the second map point in the three-dimensional coordinate system corresponding to the vector map and the last adjusted poses of the terminal device when shooting the other image frames.
  • In one possible implementation, the iteration condition is: for any iteration, if the difference between the inter-frame pose difference obtained in that iteration and the inter-frame pose difference calculated by the terminal device is less than a preset threshold, the iteration stops. The inter-frame pose difference obtained in an iteration is determined according to the pose of the terminal device when shooting the current image frame and the poses of the terminal device when shooting the other image frames, both as obtained in that iteration; the inter-frame pose difference calculated by the terminal device is the pose difference between two adjacent image frames captured by the terminal device. If the difference is greater than or equal to the threshold, the next iteration is performed, until the number of iterations equals a preset number.
  • In one possible implementation, the number of other image frames may vary with the motion state of the terminal device; specifically, the number of other image frames may be determined according to the speed of the terminal device.
  • In one possible implementation, acquiring the pose of the terminal device when shooting the current image frame includes: calculating the predicted pose of the terminal device when shooting the current image frame according to the last adjusted poses of the terminal device when shooting the other image frames and the inter-frame pose difference calculated by the terminal device; and performing hierarchical sampling on the predicted pose to obtain the pose of the terminal device when shooting the current image frame.
  • the pose of the terminal device when shooting the current image frame obtained through hierarchical sampling can be used as the initial pose of the current adjustment, thereby improving the convergence speed and robustness of the current adjustment.
  • In one possible implementation, performing hierarchical sampling on the predicted pose of the terminal device when shooting the current image frame, to obtain the pose of the terminal device when shooting the current image frame, includes: obtaining the position of the third map point in the three-dimensional coordinate system corresponding to the vector map and the position of the first feature point in the current image frame; keeping the heading angle of the predicted pose unchanged while changing its abscissa and ordinate, to obtain the first candidate pose; transforming the position of the third map point in the three-dimensional coordinate system corresponding to the vector map according to the first candidate pose, to obtain the position of the third map point in a preset image coordinate system; keeping the abscissa and ordinate of the predicted pose unchanged while changing its heading angle, to obtain the second candidate pose; transforming the position of the first feature point in the current image frame according to the second candidate pose, to obtain the position of the first feature point in the image coordinate system; and determining the pose of the terminal device when shooting the current image frame from the combination of the first candidate pose and the second candidate pose, according to the distance between the position of the third map point in the image coordinate system and the position of the first feature point in the image coordinate system.
  • In another possible implementation, the hierarchical sampling includes: obtaining the position of the third map point in the three-dimensional coordinate system corresponding to the vector map and the position of the first feature point in the current image frame; keeping the heading angle, roll angle, pitch angle, and vertical coordinate of the predicted pose unchanged while changing its abscissa and ordinate, to obtain the first candidate pose; transforming the position of the third map point in the three-dimensional coordinate system corresponding to the vector map according to the first candidate pose, to obtain the position of the third map point in the preset image coordinate system; and proceeding analogously with the remaining angles and coordinates until the candidate poses determine the pose of the terminal device when shooting the current image frame.
  • a second aspect of the embodiments of the present application provides an apparatus for locating a terminal device, the apparatus includes: a first matching module, configured to acquire, from a vector map, a first map point that matches a first feature point of a current image frame;
  • the second matching module is used to obtain, from the vector map, the second map points that match the second feature points of other image frames before the current image frame;
  • The optimization module is configured to adjust the pose of the terminal device when shooting the current image frame according to an objective function, so as to obtain the currently adjusted pose of the terminal device when shooting the current image frame as the positioning result of the terminal device in the vector map, where the objective function includes the first matching error between the first feature point and the first map point and the second matching error between the second feature point and the second map point.
  • From the foregoing apparatus, it can be seen that the first map point matching the first feature point of the current image frame, and the second map point matching the second feature points of the other image frames before the current image frame, can be obtained from the vector map, and the pose of the terminal device when shooting the current image frame is adjusted through the objective function. This considers not only the influence of the current image frame on the optimization of the pose of the terminal device when shooting the current image frame, but also the influence of the other image frames on that optimization, that is, the correlation between the current image frame and the other image frames. Since the factors considered are more comprehensive, the positioning result of the terminal device obtained in this way has higher accuracy.
  • In one possible implementation, the apparatus further includes an acquisition module configured to acquire the first feature point of the current image frame, the second feature points of the other image frames before the current image frame, the pose of the terminal device when shooting the current image frame, and the last adjusted poses of the terminal device when shooting the other image frames. The first matching module is configured to obtain the first map point matching the first feature point from the vector map according to the pose of the terminal device when shooting the current image frame, and the second matching module is configured to obtain the second map point matching the second feature point from the vector map according to the last adjusted poses of the terminal device when shooting the other image frames.
  • In one possible implementation, the optimization module is configured to: perform calculation according to the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system, to obtain the initial value of the first matching error; perform calculation according to the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system, to obtain the initial value of the second matching error; and, according to the initial value of the first matching error and the initial value of the second matching error, iteratively solve the objective function until the preset iteration condition is met, to obtain the currently adjusted pose of the terminal device when shooting the current image frame.
  • In one possible implementation, the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system includes at least one of the following: the distance between the position of the first feature point in the current image frame and the position of the first map point in the current image frame; and the distance between the position of the first feature point in the three-dimensional coordinate system corresponding to the vector map and the position of the first map point in the three-dimensional coordinate system corresponding to the vector map.
  • In one possible implementation, the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system includes at least one of the following: the distance between the position of the second feature point in the other image frames and the position of the second map point in the other image frames; and the distance between the position of the second feature point in the three-dimensional coordinate system corresponding to the vector map and the position of the second map point in the three-dimensional coordinate system corresponding to the vector map.
  • In one possible implementation, the iteration condition is: for any iteration, if the difference between the inter-frame pose difference obtained in that iteration and the inter-frame pose difference calculated by the terminal device is less than a preset threshold, the iteration stops. The inter-frame pose difference obtained in an iteration is determined according to the pose of the terminal device when shooting the current image frame and the poses of the terminal device when shooting the other image frames, both as obtained in that iteration; the inter-frame pose difference calculated by the terminal device is the pose difference between two adjacent image frames captured by the terminal device. If the difference is greater than or equal to the threshold, the next iteration is performed, until the number of iterations equals a preset number.
  • the number of other image frames is determined according to the speed of the terminal device.
  • In one possible implementation, the acquisition module is configured to: calculate the predicted pose of the terminal device when shooting the current image frame according to the last adjusted poses of the terminal device when shooting the other image frames and the inter-frame pose difference calculated by the terminal device; and perform hierarchical sampling on the predicted pose to obtain the pose of the terminal device when shooting the current image frame.
  • In one possible implementation, the acquisition module is configured to: obtain the position of the third map point in the three-dimensional coordinate system corresponding to the vector map and the position of the first feature point in the current image frame; keep the heading angle of the predicted pose unchanged and change its abscissa and ordinate, to obtain the first candidate pose; transform the position of the third map point in the three-dimensional coordinate system corresponding to the vector map according to the first candidate pose, to obtain the position of the third map point in a preset image coordinate system; keep the abscissa and ordinate of the predicted pose unchanged and change its heading angle, to obtain the second candidate pose; transform the position of the first feature point in the current image frame according to the second candidate pose, to obtain the position of the first feature point in the image coordinate system; and determine the pose of the terminal device when shooting the current image frame from the combination of the first candidate pose and the second candidate pose, according to the distance between the position of the third map point and the position of the first feature point in the image coordinate system.
  • In another possible implementation, the acquisition module is configured to: obtain the position of the third map point in the three-dimensional coordinate system corresponding to the vector map and the position of the first feature point in the current image frame; keep the heading angle, roll angle, pitch angle, and vertical coordinate of the predicted pose unchanged while changing its abscissa and ordinate; and proceed analogously to determine the pose of the terminal device when shooting the current image frame.
  • A third aspect of the embodiments of the present application provides a terminal device positioning apparatus, the apparatus including a memory and a processor. The memory stores code, and the processor is configured to execute the code; when the code is executed, the terminal device positioning apparatus performs the method described in the first aspect or any possible implementation manner of the first aspect.
  • a fourth aspect of the embodiments of the present application provides a vehicle, where the vehicle includes the terminal device positioning apparatus described in the third aspect.
  • A fifth aspect of the embodiments of the present application provides a computer storage medium that stores a computer program; when the program is executed by a computer, the computer implements the method described in the first aspect or any possible implementation manner of the first aspect.
  • A sixth aspect of the embodiments of the present application provides a computer program product that stores instructions; when the instructions are executed by a computer, the computer implements the method described in the first aspect or any possible implementation manner of the first aspect.
  • From the foregoing solution, it can be seen that the first map point matching the first feature point of the current image frame, and the second map point matching the second feature points of the other image frames before the current image frame, can be obtained from the vector map, and the pose of the terminal device when shooting the current image frame is adjusted through the objective function. This considers not only the influence of the current image frame on the optimization of the pose of the terminal device when shooting the current image frame, but also the influence of the other image frames on that optimization, that is, the correlation between the current image frame and the other image frames. Since the factors considered are more comprehensive, the positioning result of the terminal device obtained in this way has higher accuracy.
  • FIG. 1 is a schematic diagram of a vector map
  • FIG. 2 is a schematic flowchart of a method for locating a terminal device according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of a three-dimensional coordinate system corresponding to a terminal device provided by an embodiment of the present application
  • FIG. 4 is a schematic diagram of a pose difference between frames provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a first feature point of a current image frame provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of calculating the overlap degree provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a terminal device positioning apparatus provided by an embodiment of the present application.
  • FIG. 8 is another schematic structural diagram of a terminal device positioning apparatus provided by an embodiment of the present application.
  • Embodiments of the present application provide a terminal device positioning method and related devices, which can improve the accuracy of a terminal device positioning result.
  • The embodiments of the present application may be implemented by a terminal device, for example, a vehicle-mounted device on a car, an unmanned aerial vehicle, or a robot. For ease of description, the vehicle-mounted device on a car is simply referred to as a car below, and a moving car is used as an example for introduction.
  • FIG. 1 is a schematic diagram of the vector map.
  • The vector map can display the virtual traffic environment where the car is currently located, and this traffic environment includes various objects around the car, such as traffic lights, light poles, signs, and lane lines. These objects are represented by map points in the vector map; for example, a light pole can be represented by a straight line formed by multiple map points, a sign can be represented by a rectangular box formed by multiple map points, and so on. The virtual traffic environment displayed by the vector map is drawn according to the traffic environment in the real world, but the pose of the car displayed in the vector map is generally obtained by the car through calculation and may differ from the car's real pose in the real world. Therefore, the pose of the car in the vector map needs to be corrected and optimized, so as to improve the accuracy of the car's positioning result. It can be understood that the pose of the car usually includes the position of the car and the orientation of the car, which will not be repeated below.
  • In the related art, the car can shoot the current image frame, which presents the real traffic environment where the car is located at the current moment. Then, the car can match the feature points of the current image frame with the map points of the vector map, which is equivalent to matching the real traffic environment where the car is located with the virtual traffic environment. Finally, according to the matching result between the feature points of the current image frame and the map points of the vector map, for example, the matching error between the two, the pose of the car in the vector map is adjusted, and the optimized pose is used as the positioning result of the car. In this solution, however, the positioning result of the car is determined only by the current image frame; the factors considered are relatively limited, resulting in low accuracy of the positioning result of the car.
  • FIG. 2 is a schematic flowchart of a method for locating a terminal device according to an embodiment of the present application. As shown in FIG. 2 , the method includes:
  • The terminal device has a camera. During movement, the terminal device can photograph the current traffic environment through the camera to obtain the current image frame. Further, the terminal device can also obtain other image frames before the current image frame, and the number of other image frames can be determined according to the speed of the terminal device, as in formula (1), where t is the total number of the current image frame and the other image frames, t-1 is the number of the other image frames, t_0 is a preset threshold, the formula further involves a preset adjustment coefficient, and v is the speed of the terminal device at the current moment.
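  • By way of illustration, the sketch below (in Python) computes the number of frames from the speed. The published text does not reproduce formula (1) itself, so the linear form t = t0 + alpha * v and the values of t0, alpha, and the clamping bounds are assumptions of this sketch, not taken from the source.

```python
def window_size(speed_mps: float, t0: int = 5, alpha: float = 0.5,
                t_min: int = 2, t_max: int = 20) -> int:
    """Total number of frames t (the current frame plus the other frames)
    as a function of the current speed v.

    The published text does not reproduce formula (1), so a linear form
    t = t0 + alpha * v is assumed here; t0, alpha, and the clamping
    bounds are illustrative placeholders.
    """
    t = round(t0 + alpha * speed_mps)
    return max(t_min, min(t, t_max))

# A faster terminal device keeps a larger window of past frames:
print(window_size(2.0))   # -> 6
print(window_size(15.0))  # -> 12
```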
  • After the current image frame and the other image frames are obtained, the terminal device can obtain the pose of the current image frame and the poses of the other image frames after the last optimization. In this way, the pose of the current image frame can be optimized at the current time according to the current image frame and the other image frames, so as to obtain the pose of the current image frame after the current optimization. It should be noted that the poses of the other image frames after the last optimization are the results obtained when the poses of the other image frames were last optimized according to the other image frames.
  • the pose of the current image frame can be obtained in the following manner: First, the predicted pose of the current image frame is calculated according to the poses of other image frames after the last optimization and the inter-frame pose difference calculated by the terminal device. Then, perform hierarchical sampling on the predicted pose of the current image frame to obtain the pose of the current image frame.
  • the terminal device may also have an odometer, and the odometer may construct a three-dimensional coordinate system corresponding to the terminal device, for example, a vehicle body coordinate system and the like.
  • FIG. 3 is a schematic diagram of a three-dimensional coordinate system corresponding to a terminal device provided by an embodiment of the present application. As shown in FIG. 3 , in the three-dimensional coordinate system, the origin is the starting point of movement of the terminal device, and the X-axis points to the starting point of movement. Right in front of the terminal device, the Y axis points to the left side of the terminal device when it is at the starting point of the movement, and the Z axis can be zero by default.
  • The pose difference corresponding to an image frame is the pose difference between two adjacent image frames, which can also be called the inter-frame pose difference. The inter-frame pose difference can be expressed by formula (2):

$$\Delta T = \begin{bmatrix} \Delta R & \Delta t \\ 0 & 1 \end{bmatrix} \tag{2}$$

where $\Delta T$ is the inter-frame pose difference, $\Delta R$ is the rotation between two adjacent image frames, and $\Delta t$ is the translation between two adjacent image frames.
  • FIG. 4 is a schematic diagram of a pose difference between frames provided by an embodiment of the present application.
  • the current image frame and other image frames before the current image frame are t frames in total.
  • F 1 represents the first image frame in other image frames
  • F 2 represents the second image frame in other image frames
  • F t-1 represents the last image frame in other image frames (that is, the current image frame The previous image frame of the frame)
  • F t represents the current image frame.
  • Specifically, the odometer can calculate the pose difference between F_1 and F_2 as $\Delta T_1$, ..., and the pose difference between F_{t-1} and F_t as $\Delta T_{t-1}$. Then, the predicted pose of the current image frame can be calculated by formula (3):

$$P_t = P_{t-1} \cdot \Delta T_{t-1} \tag{3}$$

where $P_t$ is the predicted pose of the current image frame and $P_{t-1}$ is the pose of the previous image frame after the last optimization.
  • In addition, the predicted pose of the current image frame can also be calculated by formula (4):

$$P_t = P_{t-m} \cdot \Delta T_{t-m} \cdot \Delta T_{t-m+1} \cdots \Delta T_{t-1} \tag{4}$$

where $P_{t-m}$ is the pose of the (t-m)-th image frame among the other image frames after the last optimization.
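  • As an illustration of formulas (2) to (4), the following sketch represents each pose and inter-frame pose difference as a 4x4 homogeneous transform and obtains the predicted pose by composition. The [ΔR, Δt; 0, 1] matrix layout is the standard SE(3) convention assumed here, since the published formulas are not reproduced in the text.

```python
import numpy as np

def make_transform(dR: np.ndarray, dt: np.ndarray) -> np.ndarray:
    """Assemble the 4x4 homogeneous inter-frame transform of formula (2)
    from the rotation dR (3x3) and the translation dt (3,)."""
    T = np.eye(4)
    T[:3, :3] = dR
    T[:3, 3] = dt
    return T

def predict_pose(P_start: np.ndarray, deltas: list) -> np.ndarray:
    """Predicted pose of the current frame per formulas (3)/(4):
    compose the last optimized pose with the odometer's inter-frame
    pose differences; a single delta gives the one-step form (3)."""
    P = P_start.copy()
    for dT in deltas:
        P = P @ dT   # right-multiply successive inter-frame transforms
    return P
```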
  • After the predicted pose of the current image frame is obtained, the predicted pose can be sampled hierarchically, so as to obtain the pose of the current image frame, that is, the initial value of the pose used for the current optimization. Specifically, the predicted pose of the current image frame can be hierarchically sampled in various ways, which are introduced separately below:
  • In one implementation, the hierarchical sampling process includes: (1) within a range delimited in advance in the vector map, some map points can be arbitrarily selected as third map points, and the position of the third map point in the three-dimensional coordinate system corresponding to the vector map and the position of the first feature point in the current image frame are obtained. It can be understood that the position of the third map point in the three-dimensional coordinate system corresponding to the vector map is a three-dimensional coordinate, and the position of the first feature point in the current image frame is a two-dimensional coordinate. The intermediate steps transform these positions into a preset image coordinate system; in particular, transforming the position of the first feature point is equivalent to projecting the first feature point into the image coordinate system. (6) According to the distance between the position of the third map point in the image coordinate system and the position of the first feature point in the image coordinate system, the pose of the current image frame is determined from the combination of the first candidate pose and the second candidate pose. Through the aforementioned pose sampling method, the amount of computation required in the pose sampling process can be effectively reduced.
  • More precisely, N1×N2×N3 new pose combinations are formed from the N1×N2 groups of new third map points and the N3 groups of new first feature points. The distance between the third map point and the first feature point in each combination is calculated, yielding N1×N2×N3 distances; the smallest distance is selected, and the abscissa and ordinate of the first candidate pose corresponding to this distance, together with the heading angle of the second candidate pose corresponding to this distance, form the pose of the current image frame.
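  • A minimal sketch of the sampling search just described: N1×N2 translation candidates and N3 heading candidates are combined, and the combination with the smallest mean point distance is kept. The two projection helpers and the step grids are hypothetical placeholders for the coordinate transformations of steps (2) to (5), not APIs from the source.

```python
import numpy as np
from itertools import product

def hierarchical_sample(pred_xy, pred_yaw, map_pts_img_fn, feat_pts_img_fn,
                        dx_steps, dy_steps, yaw_steps):
    """Search N1*N2 translation candidates and N3 heading candidates around
    the predicted pose; keep the combination with the smallest mean distance
    between projected third map points and first feature points.

    map_pts_img_fn(x, y) and feat_pts_img_fn(yaw) are hypothetical helpers
    standing in for the transformations of steps (2) to (5); each returns an
    (N, 2) array, aligned so that row i of both arrays is a matched pair.
    """
    best, best_dist = None, float("inf")
    for dx, dy, dyaw in product(dx_steps, dy_steps, yaw_steps):
        cand_xy = (pred_xy[0] + dx, pred_xy[1] + dy)  # first candidate pose
        cand_yaw = pred_yaw + dyaw                    # second candidate pose
        m = map_pts_img_fn(*cand_xy)                  # third map points, image coords
        f = feat_pts_img_fn(cand_yaw)                 # first feature points, image coords
        dist = float(np.mean(np.linalg.norm(m - f, axis=1)))
        if dist < best_dist:
            best, best_dist = (cand_xy[0], cand_xy[1], cand_yaw), dist
    return best  # (x, y, heading) used as the initial pose of the current frame
```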
  • In another implementation, the hierarchical sampling process includes: (1) obtaining the position of the third map point in the three-dimensional coordinate system corresponding to the vector map and the position of the first feature point in the current image frame; (2) keeping the heading angle, roll angle, pitch angle, and vertical coordinate of the predicted pose of the current image frame unchanged, and changing the abscissa and ordinate of the predicted pose of the current image frame, to obtain the first candidate pose.
  • Steps (1) to (5) can refer to steps (1) to (5) in the previous example and will not be repeated here.
  • (6) In the preset image coordinate system, N1×N2×N3 new combinations are formed from the N1×N2 groups of new third map points and the N3 groups of new first feature points, and the distance between the third map point and the first feature point in each combination is calculated, yielding N1×N2×N3 distances. The smallest distance is selected, and the abscissa and ordinate of the first candidate pose corresponding to this distance, together with the heading angle of the second candidate pose corresponding to this distance, constitute the third candidate pose.
  • Then, according to each fourth candidate pose, the third map point in the vector map is projected into the current image frame to obtain N4×N5 groups of new third map points. N4×N5 new combinations are constructed from the N4×N5 groups of new third map points and the first feature points in the current image frame, and the distance between the third map point and the first feature point in each combination is calculated, yielding N4×N5 distances. The smallest distance is selected, and the pitch angle and vertical coordinate of the fourth candidate pose corresponding to this distance, together with the abscissa, ordinate, heading angle, and roll angle of the third candidate pose, constitute the pose of the current image frame.
  • It should be noted that semantic detection can also be performed on the current image frame and the other image frames, so as to obtain the first feature point of the current image frame and the second feature points of the other image frames. Specifically, semantic detection processing, that is, feature extraction, can be performed on the current image frame and the other image frames through a neural network, so as to obtain the first feature point of the current image frame and the second feature points of the other image frames. The first feature point and the second feature point can be understood as semantic identifications on the images.
  • For example, the first feature points of the current image frame include the feature points of various objects in the traffic environment. FIG. 5 is a schematic diagram of the first feature points of the current image frame provided by an embodiment of this application. As shown in FIG. 5, the feature points of light poles and of lane lines can be the pixel points at their two ends, and the feature points of traffic lights and of signs can be a rectangular frame (that is, an outer bounding box) formed from a collection of multiple pixel points. The second feature points of the other image frames are obtained in the same way, which will not be repeated here.
  • The aforementioned neural network is a trained neural network model, and its training process is briefly introduced as follows: each image frame to be trained can be input into the model to be trained, and the feature points of each image frame to be trained, that is, the predicted feature points, are obtained through the model. The difference between the predicted feature points of each image frame to be trained and the real feature points of the corresponding image frame is then calculated through a target loss function; if the difference falls within a qualified range, the image frame to be trained is regarded as qualified, and if it falls outside the qualified range, the image frame to be trained is regarded as unqualified.
  • It should be noted that the pose of the current image frame generally refers to the pose of the terminal device in the three-dimensional coordinate system corresponding to the vector map when shooting the current image frame, and the poses of the other image frames refer to the poses of the terminal device in that coordinate system when shooting the other image frames. It should also be understood that the process of the last optimization and the process of the next optimization may both be understood with reference to the process of the current optimization.
  • It is worth noting that for the first image frame captured by the terminal device, its pose can be acquired through the global positioning system (GPS) of the terminal device as the object of the first optimization. Through the foregoing process, the initial value of the pose of the current image frame for the current optimization is obtained.
  • After the pose of the current image frame is obtained, the first map point matching the first feature point can be obtained from the vector map preset in the terminal device. Specifically, the first map point matching the first feature point can be obtained in various ways, which are introduced separately below:
  • In one implementation, an area including the terminal device, for example a range of 150 m × 150 m, can be delineated in the vector map. According to the pose of the current image frame, coordinate transformation is performed on the positions of the map points in this area in the three-dimensional coordinate system corresponding to the vector map, to obtain the positions of these map points in the current image frame. This process is equivalent to projecting the map points in the area into the current image frame according to the pose of the current image frame. Since the first feature points of the current image frame include the feature points of various objects, and the map points in this area also include the map points of various objects, the nearest neighbor algorithm can be applied to the positions of the first feature points and of these map points in the current image frame, so that matching is performed between like objects and the first map point matching the first feature point is determined among these map points.
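  • A sketch of the projection step under an assumed pinhole camera model; the patent does not fix a camera model, so the intrinsic matrix K and the world-to-camera convention are assumptions of this sketch:

```python
import numpy as np

def project_map_points(points_world: np.ndarray, T_world_cam: np.ndarray,
                       K: np.ndarray) -> np.ndarray:
    """Project (N, 3) map points, given in the vector map's 3D coordinate
    system, into the current image frame using the pose of the current frame.

    T_world_cam is the 4x4 camera pose in the vector map's coordinate system
    and K is the 3x3 pinhole intrinsic matrix (both assumed for this sketch).
    """
    T_cam_world = np.linalg.inv(T_world_cam)
    pts_h = np.hstack([points_world, np.ones((len(points_world), 1))])
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]  # map points in the camera frame
    uv = (K @ pts_cam.T).T                      # perspective projection
    return uv[:, :2] / uv[:, 2:3]               # normalize by depth -> pixel coords
```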
  • For example, an object such as a light pole can be represented in the vector map by a straight line formed by a collection of multiple map points; projected into the current image frame, it is still a straight line, hereinafter called the projected straight line. In the current image frame, a light pole is represented by the feature points at its two ends, hereinafter called endpoints. Suppose light poles A, B, and C in the vector map are projected into the current image frame; to determine which one matches light pole D in the current image frame, the average distance from the two endpoints of light pole D to the projected straight line of pole A, to the projected straight line of pole B, and to the projected straight line of pole C can be calculated. The light pole with the smallest average distance is determined as the light pole matching light pole D, and its map points match the feature points of light pole D.
  • For another example, an object such as a sign (or a traffic light) can be represented by a rectangular frame assembled from multiple map points; projected into the current image frame, it is still a rectangular frame, and a sign is likewise represented in the current image frame by a rectangular frame formed from a collection of multiple feature points. Suppose sign X and sign Y in the vector map are projected into the current image frame; to determine which one matches sign Z in the current image frame, the average distance from the four vertices of the rectangular frame of sign Z to the projected straight lines of the two parallel sides of the rectangular frame of sign X, and to those of the rectangular frame of sign Y, can be calculated. The sign with the smallest average distance is determined as the sign matching sign Z, and its map points match the feature points of sign Z.
  • For another example, an object such as a lane line can be represented by a straight line formed by a collection of multiple map points; projected into the current image frame, it remains a straight line, while a lane line is represented in the current image frame by the feature points at its two ends. Suppose lane lines E and F in the vector map are projected into the current image frame; to determine which one matches lane line G in the current image frame, the following can be calculated: the average distance from the two endpoints of lane line G to the projected straight line of lane line E, together with the degree of overlap between lane line G and the projected straight line of lane line E; and the average distance from the two endpoints of lane line G to the projected straight line of lane line F, together with the corresponding degree of overlap. The distance and the degree of overlap together form a comprehensive distance (for example, if the overlap degrees for lane lines E and F are the same, the lane line with the smaller distance has the smaller comprehensive distance). The lane line with the smallest comprehensive distance is determined as the lane line matching lane line G, and its map points match the feature points of lane line G.
  • FIG. 6 is a schematic diagram of calculating the degree of overlap provided by an embodiment of this application. Given the lane line JK of the current image frame and the projected straight line PQ of a lane line of the vector map, let U be the foot of the perpendicular from endpoint J onto the projected straight line PQ, and V the foot of the perpendicular from endpoint K onto PQ. The degree of overlap can then be calculated as

$$l_{overlap} = \frac{d_{UV \cap PQ}}{d_{UV}}$$

where $l_{overlap}$ is the degree of overlap, $d_{UV}$ is the length of line segment UV, and $d_{UV \cap PQ}$ is the length of the overlapping portion between line segment UV and line segment PQ.
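  • A minimal sketch of this overlap computation on the construction of FIG. 6: the endpoints J and K are projected onto the line PQ to obtain U and V, and the ratio of the overlapping length to the length of UV is returned. The 2D line parameterization is an assumption of this sketch.

```python
import numpy as np

def overlap_degree(J, K, P, Q):
    """l_overlap = d_{UV∩PQ} / d_UV, where U and V are the feet of the
    perpendiculars from endpoints J and K onto the projected line PQ."""
    J, K, P, Q = map(np.asarray, (J, K, P, Q))
    d = (Q - P) / np.linalg.norm(Q - P)          # unit direction of PQ
    # 1D coordinates of U, V and of P, Q along the projected line
    u, v = float(np.dot(J - P, d)), float(np.dot(K - P, d))
    p, q = 0.0, float(np.linalg.norm(Q - P))
    lo, hi = min(u, v), max(u, v)
    overlap = max(0.0, min(hi, q) - max(lo, p))  # length of UV ∩ PQ
    return overlap / (hi - lo) if hi > lo else 0.0

print(overlap_degree((0, 1), (4, 1), (2, 0), (8, 0)))  # 0.5: half of UV lies on PQ
```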
  • In another implementation, an area containing the terminal device may likewise be delineated in the vector map. Then, according to the pose of the current image frame, coordinate transformation is performed on the position of the first feature point of the current image frame in the current image frame, to obtain the position of the first feature point in the three-dimensional coordinate system corresponding to the vector map; this process is equivalent to projecting the first feature point of the current image frame into the three-dimensional coordinate system corresponding to the vector map according to the pose of the current image frame. Since the first feature points of the current image frame include the feature points of various objects, and the map points in this area of the vector map also include the map points of various objects, the nearest neighbor algorithm can be applied to the positions of the first feature points and of these map points in the three-dimensional coordinate system corresponding to the vector map, so that matching is performed between like objects and the first map point matching the first feature point is determined among these map points.
  • In yet another implementation, an area including the terminal device may be delineated in the vector map, and according to the pose of the current image frame, coordinate transformation is performed on the positions of the map points in the area in the three-dimensional coordinate system corresponding to the vector map, to obtain the positions of these map points in the three-dimensional coordinate system corresponding to the terminal device. Meanwhile, coordinate transformation is performed on the position of the first feature point of the current image frame in the current image frame, to obtain the position of the first feature point in the three-dimensional coordinate system corresponding to the terminal device. Then, the nearest neighbor algorithm can be applied to the positions of the first feature points and of these map points in the three-dimensional coordinate system corresponding to the terminal device, so that matching is performed between like objects and the first map point matching the first feature point is determined among these map points.
  • It should be noted that in the foregoing three implementations, the feature points of all objects in the current image frame and the map points of all objects in the delineated area of the vector map are placed in one coordinate system, so as to complete the matching between feature points and map points. In a further implementation, the feature points and map points of certain types of objects (for example, traffic lights, light poles, and signs) can be placed in one coordinate system (for example, the current image frame) for matching, while the feature points and map points of other types of objects (for example, lane lines) are placed in another coordinate system (for example, the three-dimensional coordinate system corresponding to the terminal device) for matching.
  • After the first map point matching the first feature point is determined, the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system includes at least one of the following: 1. The distance between the position of the first feature point in the current image frame and the position of the first map point in the current image frame. 2. The distance between the position of the first feature point in the three-dimensional coordinate system corresponding to the vector map and the position of the first map point in the three-dimensional coordinate system corresponding to the vector map. 3. The distance between the position of the first feature point in the three-dimensional coordinate system corresponding to the terminal device and the position of the first map point in the three-dimensional coordinate system corresponding to the terminal device.
  • For example, suppose the current image frame contains light pole W1, sign W2, lane line W3, and lane line W4; light pole W5 in the vector map matches light pole W1, sign W6 in the vector map matches sign W2, lane line W7 in the vector map matches lane line W3, and lane line W8 in the vector map matches lane line W4.
  • Case 1: when the first coordinate system is the current image frame, the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system is the distance between the position of the first feature point in the current image frame and the position of the first map point in the current image frame, including: after projection into the current image frame, the average distance from the two endpoints of light pole W1 to the projected straight line of light pole W5; the average distance from the four vertices of the rectangular frame of sign W2 to the projected straight lines of the two parallel sides of the rectangular frame of sign W6; the comprehensive distance between lane line W3 and lane line W7; and the comprehensive distance between lane line W4 and lane line W8.
  • Case 2: when the first coordinate system includes the current image frame and the three-dimensional coordinate system corresponding to the terminal device, the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system includes the distance between their positions in the current image frame and the distance between their positions in the three-dimensional coordinate system corresponding to the terminal device. The distance between the positions in the current image frame includes: after projection into the current image frame, the average distance from the two endpoints of light pole W1 to the projected straight line of light pole W5, and the average distance from the four vertices of the rectangular frame of sign W2 to the projected straight lines of the two parallel sides of the rectangular frame of sign W6. The distance between the positions in the three-dimensional coordinate system corresponding to the terminal device includes: after projection into the three-dimensional coordinate system corresponding to the terminal device, the comprehensive distance between lane line W3 and lane line W7, and the comprehensive distance between lane line W4 and lane line W8.
  • The remaining cases are as follows, and the corresponding distances can be obtained by analogy with Case 1 and Case 2: Case 3, when the first coordinate system is the three-dimensional coordinate system corresponding to the vector map; Cases 4 and 5, when the first coordinate system is the three-dimensional coordinate system corresponding to the terminal device; Cases 6 and 7, when the first coordinate system includes the three-dimensional coordinate system corresponding to the terminal device and the three-dimensional coordinate system corresponding to the vector map; and the case in which the first coordinate system includes the current image frame, the three-dimensional coordinate system corresponding to the terminal device, and the three-dimensional coordinate system corresponding to the vector map.
  • Similarly, based on the poses of the other image frames after the last optimization, that is, the initial values of the poses of the other image frames used for the current optimization, the second map points matching the second feature points can be obtained from the vector map.
  • After the second map point matching the second feature point is determined, the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system includes at least one of the following: 1. The distance between the position of the second feature point in the other image frames and the position of the second map point in the other image frames. 2. The distance between the position of the second feature point in the three-dimensional coordinate system corresponding to the vector map and the position of the second map point in the three-dimensional coordinate system corresponding to the vector map. 3. The distance between the position of the second feature point in the three-dimensional coordinate system corresponding to the terminal device and the position of the second map point in the three-dimensional coordinate system corresponding to the terminal device. For details about these distances, refer to the description in step 202 of the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system, which is not repeated here.
  • After the first map point matching the first feature point and the second map point matching the second feature point are obtained, the terminal device can construct an objective function that includes the first matching error between the first feature point and the first map point and the second matching error between the second feature point and the second map point. The pose of the current image frame is adjusted, that is, optimized, according to this objective function, and the pose of the current image frame after the current optimization is obtained as the positioning result of the terminal device.
  • the initial value of the first matching error may be obtained according to the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system. Still as described in the above example, the initial value of the first matching error can be obtained by formula (6):
  • the first matching error is determined by Huber ⁇ 1 and Huber ⁇ 2
  • Huber ⁇ 1 is the Huber loss function whose parameter is ⁇ 1
  • Huber ⁇ 2 is the Huber loss function whose parameter is ⁇ 2
  • is a preset parameter
  • d pp is the current image
  • the distance corresponding to objects such as light poles in the frame is the distance between the i-th light pole and the matching light pole in the current image frame, is the distance from the two end points of the i-th light pole to the projected straight line of the matching light pole
  • d pl is the distance corresponding to the two types of objects of traffic lights (or signs) in the current image frame
  • d pH is the comprehensive distance corresponding to objects such
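A minimal sketch of how the initial value of the first matching error could be assembled from the per-object distances, assuming the robustified-sum structure described above; `huber`, `first_matching_error` and the argument names are illustrative, not taken from the source:

```python
def huber(r: float, eps: float) -> float:
    """Huber loss with parameter eps: quadratic for |r| <= eps, linear beyond."""
    a = abs(r)
    return 0.5 * a * a if a <= eps else eps * (a - 0.5 * eps)

def first_matching_error(d_pp, d_pl, d_ph, eps1, eps2):
    """Initial value of the first matching error: a robustified sum of the
    per-object distances (poles d_pp, lights/signs d_pl, lane lines d_ph)."""
    return (sum(huber(d, eps1) for d in d_pp)
            + sum(huber(d, eps1) for d in d_pl)
            + sum(huber(d, eps2) for d in d_ph))
```

For example, `first_matching_error([0.8], [1.2, 0.5], [2.0], eps1=1.0, eps2=2.0)` accumulates the three object classes into one scalar; the Huber kernels keep occasional gross mismatches from dominating the error.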
  • Likewise, the initial value of the second matching error can be computed from the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system; it can also be obtained by formula (6), which is not repeated here.
  • Once both initial values are obtained, they can be fed into the objective function, which is solved iteratively until the preset iteration condition is met, yielding the currently optimized pose of the current image frame. Based on formula (6), the objective function can be expressed by formula (7), which sums the formula-(6) terms over the current image frame and the other image frames; its three per-frame quantities are, respectively, the distance corresponding to pole-like objects in the i-th image frame, the distance corresponding to traffic lights (or signs) in the i-th image frame, and the distance corresponding to lane lines in the i-th image frame.
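Since the rendered formulas are preserved only as image references in this record, formula (7) can plausibly be written as follows; this LaTeX block is a reconstruction from the symbol definitions above (frame index i, per-object indexes j, k, l), not the original rendering:

```latex
\min_{P_1,\dots,P_t}\;\sum_{i=1}^{t}\left(
  \sum_{j}\operatorname{Huber}_{\varepsilon_1}\!\left(d_{pp}^{\,i,j}\right)
+ \sum_{k}\operatorname{Huber}_{\varepsilon_1}\!\left(d_{pl}^{\,i,k}\right)
+ \sum_{l}\operatorname{Huber}_{\varepsilon_2}\!\left(d_{pH}^{\,i,l}\right)
\right)
```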
  • During the iterative solution, the first iteration consists of feeding the initial values of the first and second matching errors into the objective function and solving it, which yields the first-iteration pose of the current image frame and the first-iteration poses of the other image frames. The inter-frame pose differences are then computed from these poses; if the difference between them and the inter-frame pose differences calculated by the terminal device's odometer is less than the preset threshold, the objective function is considered converged, the iteration stops, and the first-iteration pose of the current image frame is taken as the currently optimized pose. If the difference is greater than or equal to the preset threshold, a second iteration is performed.
  • In the second iteration, the first map point matching the first feature point is re-determined from the first-iteration pose of the current image frame (i.e., step 202 is re-executed), and the second map point matching the second feature point is re-determined from the first-iteration poses of the other image frames (i.e., step 203 is re-executed). Then, the first-iteration value of the first matching error between the first feature point and the first map point and the first-iteration value of the second matching error between the second feature point and the second map point are calculated and fed into the objective function, yielding the second-iteration poses of the current image frame and of the other image frames. The convergence test is applied again: if the difference between the resulting inter-frame pose differences and those calculated by the odometer is less than the threshold, the second-iteration pose of the current image frame is the result; if it is greater than or equal to the threshold, a third iteration is performed, and so on until the number of iterations equals the preset number, at which point the objective function is also considered converged and the pose of the current image frame obtained in the last iteration is taken as the currently optimized pose.
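The iterate-rematch-check loop above can be sketched as follows for a planar three-degree-of-freedom pose; `solve_once` stands for one pass of a nonlinear least-squares solver over the joint objective and `rematch` for re-executing steps 202/203 — both are assumed interfaces, not taken from the source:

```python
import numpy as np

def se2_delta(p_from, p_to):
    """Relative motion between two planar poses [x, y, yaw], expressed in the
    frame of p_from so it is comparable with an odometer increment."""
    c, s = np.cos(-p_from[2]), np.sin(-p_from[2])
    dx, dy = p_to[0] - p_from[0], p_to[1] - p_from[1]
    dyaw = (p_to[2] - p_from[2] + np.pi) % (2 * np.pi) - np.pi
    return np.array([c * dx - s * dy, s * dx + c * dy, dyaw])

def optimize_window(poses, odom_deltas, solve_once, rematch,
                    threshold=1e-2, max_iters=10):
    """Iterate: solve the joint objective, test consistency with the odometer,
    re-associate and repeat. poses[-1] is the current image frame."""
    matches = rematch(poses)
    for _ in range(max_iters):
        poses = solve_once(poses, matches)
        diffs = [se2_delta(poses[i], poses[i + 1]) for i in range(len(poses) - 1)]
        gap = max(np.linalg.norm(d - o) for d, o in zip(diffs, odom_deltas))
        if gap < threshold:       # consistent with the odometer: converged
            break
        matches = rematch(poses)  # otherwise re-run the matching (steps 202/203)
    return poses[-1]              # currently optimized pose of the current frame
```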
  • In this embodiment, after the current image frame and the preceding image frames are obtained, the first map point matching the first feature point of the current frame and the second map points matching the second feature points of the other frames can be obtained from the vector map. The pose of the current image frame is then adjusted according to the objective function constructed from the two matching errors, giving the currently optimized pose of the current image frame.
  • In the foregoing process, since the objective function includes both the matching error between the feature points of the current image frame and the map points of the vector map and the matching errors between the feature points of the other image frames and the map points of the vector map, adjusting the pose of the current image frame through this function considers not only the influence of the current image frame on the optimization of that pose but also the influence of the other image frames, that is, the correlation between the current image frame and the other image frames. The factors considered are more comprehensive, so the positioning result of the terminal device obtained in this way has higher accuracy.
  • Furthermore, in the related art the objective function is constructed only from the matching error between the feature points of the current image frame and the map points of the vector map. Since the content a single frame can present is limited, the map points selected to match its feature points are often sparse and overlapping; the iterative solution therefore cannot make the matching error small enough, which affects the accuracy of the positioning result.
  • Here, the objective function is constructed from the first matching error between the first feature point of the current image frame and the first map point of the vector map and the second matching error between the second feature points of the other image frames and the second map points of the vector map. Because the content presented by multiple image frames usually differs considerably, the sparse-and-overlapping situation is avoided, so the iterative solution (a joint optimization of the poses of multiple image frames) can make the first and second matching errors small enough, improving the accuracy of the positioning result.
  • Furthermore, the pose of the current image frame obtained through hierarchical sampling can be used as the initial pose value of the current optimization, thereby improving the convergence speed and robustness of the current optimization.
  • FIG. 7 is a schematic structural diagram of a terminal equipment positioning apparatus provided by an embodiment of the present application. As shown in FIG. 7 , the apparatus includes:
  • the first matching module 701 is configured to obtain, from the vector map, a first map point that matches the first feature point of the current image frame captured by the terminal device;
  • the second matching module 702 is configured to obtain, from the vector map, a second map point that matches the second feature point of other image frames before the current image frame;
  • the adjustment module 703 is configured to adjust, according to the objective function, the pose at which the terminal device captured the current image frame, and to obtain the currently adjusted pose at which the terminal device captured the current image frame as the positioning result of the terminal device, the objective function including a first matching error between the first feature point and the first map point and a second matching error between the second feature point and the second map point.
  • In a possible implementation, the apparatus further includes an acquisition module 700 configured to acquire the first feature point of the current image frame, the second feature point of other image frames preceding the current image frame, the pose at which the terminal device captured the current image frame and the last-adjusted pose at which the terminal device captured the other image frames. The first matching module 701 is configured to obtain, from the vector map, the first map point matching the first feature point according to the pose at which the terminal device captured the current image frame; the second matching module 702 is configured to obtain, from the vector map, the second map point matching the second feature point according to the last-adjusted pose at which the terminal device captured the other image frames.
  • The adjustment module 703 is configured to: compute the initial value of the first matching error from the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system; compute the initial value of the second matching error from the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system; and iteratively solve the objective function based on the two initial values until the preset iteration condition is met, obtaining the currently adjusted pose at which the terminal device captured the current image frame.
  • The distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system includes at least one of the following: the distance between the position of the first feature point in the current image frame and the position of the first map point in the current image frame; the distance between the position of the first feature point in the three-dimensional coordinate system corresponding to the vector map and the position of the first map point in that coordinate system; or the distance between the position of the first feature point in the three-dimensional coordinate system corresponding to the terminal device and the position of the first map point in that coordinate system.
  • The distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system likewise includes at least one of the following: the distance between their positions in the other image frames; the distance between their positions in the three-dimensional coordinate system corresponding to the vector map; or the distance between their positions in the three-dimensional coordinate system corresponding to the terminal device.
  • The iteration condition is: for any iteration, if the difference between the inter-frame pose differences obtained by that iteration and the inter-frame pose differences calculated by the terminal device is less than a preset threshold, the iteration stops; if the difference is greater than or equal to the threshold, the next iteration is executed, until the number of iterations equals the preset number.
  • the number of other image frames is determined according to the speed of the terminal device.
  • The acquisition module 700 is configured to: compute the predicted pose at which the terminal device captured the current image frame from the last-adjusted pose at which the terminal device captured the other image frames and the inter-frame pose differences calculated by the terminal device; and hierarchically sample the predicted pose to obtain the pose at which the terminal device captured the current image frame.
  • If the pose at which the terminal device captured the current image frame contains an abscissa, an ordinate and a heading angle, the acquisition module 700 is configured to: obtain the positions of third map points in the three-dimensional coordinate system corresponding to the vector map and the position of the first feature point in the current image frame; keep the heading angle of the predicted pose unchanged and vary its abscissa and ordinate to obtain first candidate poses; transform the positions of the third map points in the vector map's three-dimensional coordinate system according to each first candidate pose to obtain their positions in a preset image coordinate system; keep the abscissa and ordinate of the predicted pose unchanged and vary its heading angle to obtain second candidate poses; transform the position of the first feature point in the current image frame according to each second candidate pose to obtain its position in the image coordinate system; and determine the pose at which the terminal device captured the current image frame from the combinations of first and second candidate poses according to the distances between the positions of the third map points and of the first feature point in the image coordinate system.
  • If the pose contains an abscissa, an ordinate, a vertical coordinate, a heading angle, a roll angle and a pitch angle, the acquisition module 700 is configured to: obtain the positions of the third map points in the vector map's three-dimensional coordinate system and the position of the first feature point in the current image frame; keep the heading angle, roll angle, pitch angle and vertical coordinate of the predicted pose unchanged and vary its abscissa and ordinate to obtain first candidate poses; transform the third map points according to each first candidate pose to obtain their positions in the preset image coordinate system; keep the abscissa, ordinate, vertical coordinate, roll angle and pitch angle of the predicted pose unchanged and vary its heading angle to obtain second candidate poses; transform the first feature point according to each second candidate pose to obtain its position in the image coordinate system; determine a third candidate pose from the combinations of first and second candidate poses according to the distances between the third map points and the first feature point in the image coordinate system; keep the abscissa, ordinate, heading angle and roll angle of the third candidate pose unchanged and vary its pitch angle and vertical coordinate to obtain fourth candidate poses; transform the third map points according to each fourth candidate pose to obtain their positions in the current image frame; and determine the pose at which the terminal device captured the current image frame from the fourth candidate poses according to the distances between the position of the first feature point and the positions of the third map points in the current image frame. Both sampling schemes effectively reduce the computation required during pose sampling; a sketch of the first follows below.
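A compact sketch of the two-stage sampling (translation first, heading second) for the three-degree-of-freedom case; the projection callbacks and the assumption of one-to-one matched point arrays are illustrative, not interfaces from the source:

```python
import numpy as np

def hierarchical_sample(pred_pose, xy_offsets, yaw_offsets,
                        project_map_pts, project_feat_pts):
    """Two-stage sampling around the predicted pose [x, y, yaw].

    Stage 1 varies (x, y) with the heading fixed; stage 2 varies the heading
    with (x, y) fixed. project_map_pts(pose) / project_feat_pts(pose) must
    return matched (N, 2) arrays of third map points and first feature points
    in the common image coordinate system under the candidate pose.
    """
    best_pose, best_cost = None, np.inf
    for dx, dy in xy_offsets:                       # N1 x N2 first candidates
        cand_xy = pred_pose + np.array([dx, dy, 0.0])
        pts_map = project_map_pts(cand_xy)
        for dyaw in yaw_offsets:                    # N3 second candidates
            cand_yaw = pred_pose + np.array([0.0, 0.0, dyaw])
            pts_feat = project_feat_pts(cand_yaw)
            cost = np.mean(np.linalg.norm(pts_map - pts_feat, axis=1))
            if cost < best_cost:                    # smallest distance wins
                best_cost = cost
                best_pose = np.array([cand_xy[0], cand_xy[1], cand_yaw[2]])
    return best_pose
```

Because translation and heading are sampled in separate stages, only N1×N2 + N3 projections are computed instead of N1×N2×N3, which is the computational saving the text refers to.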
  • FIG. 8 is another schematic structural diagram of a terminal device positioning apparatus provided by an embodiment of the present application.
  • As shown in FIG. 8, this embodiment of the apparatus may include one or more central processing units 801, a memory 802, an input/output interface 803, a wired or wireless network interface 804, and a power supply 805.
  • the memory 802 may be ephemeral storage or persistent storage. Furthermore, the central processing unit 801 may be configured to communicate with the memory 802 to execute a series of instruction operations in the memory 802 on the computer.
  • the central processing unit 801 may execute the method steps in the foregoing embodiment shown in FIG. 2 , and details are not repeated here.
  • The division of specific functional modules in the central processing unit 801 may be similar to the division into the acquisition module, the first matching module, the second matching module and the adjustment module described in FIG. 7, and is not repeated here.
  • Embodiments of the present application also relate to a computer storage medium, including computer-readable instructions, when the computer-readable instructions are executed, the method described in FIG. 2 is implemented.
  • Embodiments of the present application also relate to a computer program product containing instructions, which, when run on a computer, cause the computer to execute the method described in FIG. 2 .
  • The disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical functional division, and in actual implementation there may be other divisions; multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, and can also be implemented in the form of software functional units.
  • If implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present application, in essence or the part contributing to the prior art, or all or part of them, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.


Abstract

This application provides a terminal device positioning method and a related device, which can improve the accuracy of the positioning result of a terminal device. The method includes: obtaining, from a vector map, a first map point that matches a first feature point of a current image frame captured by the terminal device; obtaining, from the vector map, a second map point that matches a second feature point of other image frames preceding the current image frame; and adjusting, according to an objective function, the pose at which the terminal device captured the current image frame, to obtain the currently adjusted pose at which the terminal device captured the current image frame as the positioning result of the terminal device, where the objective function includes a first matching error between the first feature point and the first map point and a second matching error between the second feature point and the second map point.

Description

Terminal device positioning method and related device
This application claims priority to Chinese Patent Application No. 202110460636.4, filed with the China National Intellectual Property Administration on April 27, 2021 and entitled "Terminal device positioning method and related device", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of artificial intelligence, and in particular to a terminal device positioning method and a related device.
Background
Smart terminal devices such as self-driving cars, drones and robots are now widely used in daily life. For these mobile terminal devices, high-precision positioning technology has emerged so that their real-time positions can be obtained accurately.
When positioning a terminal device, the current image frame captured by the device can be obtained, and map points matching the feature points that present objects of the traffic environment in the current image frame can be retrieved from a preset vector map (in which objects of the traffic environment are represented by map points; for example, a light pole is represented by a straight line formed from map points, and a sign by a rectangular box formed from map points). The positioning result of the terminal device on the vector map is then determined from the matching result between the feature points and the map points.
However, this positioning process considers rather limited factors, so the accuracy of the positioning result is low.
Summary
Embodiments of this application provide a terminal device positioning method and a related device, which can improve the accuracy of the positioning result of a terminal device.
A first aspect of the embodiments of this application provides a terminal device positioning method, including:
While moving, the terminal device can photograph the traffic environment with a camera at the current moment to obtain the current image frame. Further, the terminal device can also obtain other image frames preceding the current image frame, and can then position itself based on the current image frame and the other image frames.
Specifically, the terminal device first obtains, from the vector map, a first map point that matches a first feature point of the current image frame. For example, the feature points presenting a traffic light in the current image frame and the map points representing that traffic light in the vector map are matching points, and likewise for lane lines and so on. Similarly, the terminal device can also obtain, from the vector map, a second map point that matches a second feature point of the other image frames preceding the current image frame.
Since there is a certain matching error between the first feature point and the first map point, and likewise between the second feature point and the second map point, both errors need to be made as small as possible to improve the accuracy of the positioning result.
On this basis, the terminal device can construct an objective function from the first matching error between the first feature point and the first map point and the second matching error between the second feature point and the second map point, and adjust, i.e., optimize, the pose at which it captured the current image frame according to the objective function until the function converges, thereby obtaining the currently adjusted (optimized) pose at which the terminal device captured the current image frame as the positioning result on the vector map. Here, the pose at which the terminal device captured the current image frame generally refers to the pose of the terminal device in the three-dimensional coordinate system corresponding to the vector map at the moment the current frame was captured.
It can be seen from the above method that, after the current image frame and the preceding image frames are obtained, the first map point matching the first feature point of the current frame and the second map point matching the second feature points of the other frames can be obtained from the vector map, and the pose of the current frame can be adjusted according to an objective function built from both matching errors. Because the objective function contains both the matching error between the feature points of the current image frame and the map points of the vector map and the matching errors between the feature points of the other image frames and the map points of the vector map, adjusting the pose with this function considers not only the influence of the current image frame on the optimization of that pose but also the influence of the other image frames, i.e., the correlation between the current frame and the other frames. The factors considered are more comprehensive, so the positioning result obtained in this way is more accurate.
In a possible implementation, the method further includes: obtaining the pose at which the terminal device captured the current image frame and the last-adjusted pose at which it captured the other image frames, and performing semantic detection on the current image frame and the preceding image frames to obtain the first feature point of the current frame and the second feature points of the other frames. The first map point matching the first feature point can then be obtained from the vector map according to the pose of the current frame, and the second map point matching the second feature point according to the last-adjusted poses of the other frames. In this way, the association between feature points and map points is completed.
In a possible implementation, adjusting the pose of the current frame according to the objective function includes: after obtaining the positions of the first feature point and the first map point in a first coordinate system, computing the initial value of the first matching error from the distance between those positions; after obtaining the positions of the second feature point and the second map point in a second coordinate system, computing the initial value of the second matching error from the distance between those positions; and iteratively solving the objective function based on the two initial values until a preset iteration condition is met, to obtain the currently adjusted pose of the current frame. In this implementation, solving the objective function with both initial values amounts to adjusting the pose of the current frame based on the current frame and the other frames together, so the factors considered are more comprehensive and the positioning result of the terminal device is obtained accurately.
In a possible implementation, the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system includes at least one of the following: (1) the distance between their positions in the current image frame, where the first map point's position in the current frame is obtained from its position in the three-dimensional coordinate system corresponding to the vector map and the pose at which the terminal device captured the current frame; (2) the distance between their positions in the vector map's three-dimensional coordinate system, where the first feature point's position in that system is obtained from its position in the current frame and the capture pose; (3) the distance between their positions in the three-dimensional coordinate system corresponding to the terminal device, where the first feature point's position in that system is obtained from its position in the current frame and the capture pose, and the first map point's position from its position in the vector map's three-dimensional coordinate system and the capture pose.
In a possible implementation, the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system includes at least one of the following: (1) the distance between their positions in the other image frames, where the second map point's position is obtained from its position in the vector map's three-dimensional coordinate system and the last-adjusted poses of the other frames; (2) the distance between their positions in the vector map's three-dimensional coordinate system, where the second feature point's position is obtained from its position in the other frames and the last-adjusted poses; (3) the distance between their positions in the terminal device's three-dimensional coordinate system, both obtained analogously using the last-adjusted poses.
In a possible implementation, the iteration condition is: for any iteration, if the difference between the inter-frame pose differences obtained by that iteration and the inter-frame pose differences computed by the terminal device is less than a preset threshold, the iteration stops; the inter-frame pose differences of an iteration are determined from the pose of the current frame and the poses of the other frames obtained by that iteration, and an inter-frame pose difference is the pose difference of the terminal device between capturing two adjacent frames among the current and other image frames. If the difference is greater than or equal to the threshold, the next iteration is executed until the number of iterations equals a preset number.
In a possible implementation, the number of other image frames can vary with the motion state of the terminal device; specifically, it can be determined according to the speed of the terminal device.
In a possible implementation, obtaining the pose at which the terminal device captured the current frame includes: computing a predicted pose for the current frame from the last-adjusted poses of the other frames and the inter-frame pose differences computed by the terminal device, and hierarchically sampling the predicted pose to obtain the pose for the current frame. The pose obtained by hierarchical sampling serves as the initial pose of the current adjustment, which improves its convergence speed and robustness.
In a possible implementation, if the pose of the current frame contains an abscissa, an ordinate and a heading angle, the hierarchical sampling includes: obtaining the positions of third map points in the vector map's three-dimensional coordinate system and the position of the first feature point in the current frame; keeping the heading angle of the predicted pose unchanged while varying its abscissa and ordinate to obtain first candidate poses; transforming the third map points' positions according to each first candidate pose to obtain their positions in a preset image coordinate system; keeping the abscissa and ordinate of the predicted pose unchanged while varying its heading angle to obtain second candidate poses; transforming the first feature point's position in the current frame according to each second candidate pose to obtain its position in the image coordinate system; and determining the capture pose of the current frame from the combinations of first and second candidate poses according to the distances between the third map points and the first feature point in the image coordinate system. This sampling scheme effectively reduces the computation required during pose sampling.
In a possible implementation, if the pose contains an abscissa, an ordinate, a vertical coordinate, a heading angle, a roll angle and a pitch angle, the hierarchical sampling includes: obtaining the positions of the third map points in the vector map's three-dimensional coordinate system and the position of the first feature point in the current frame; keeping the heading, roll, pitch angles and vertical coordinate of the predicted pose unchanged while varying its abscissa and ordinate to obtain first candidate poses; transforming the third map points according to each first candidate pose into the preset image coordinate system; keeping the abscissa, ordinate, vertical coordinate, roll and pitch unchanged while varying the heading to obtain second candidate poses; transforming the first feature point accordingly into the image coordinate system; determining a third candidate pose from the first/second combinations according to the distances between the third map points and the first feature point in the image coordinate system; keeping the abscissa, ordinate, heading and roll of the third candidate pose unchanged while varying its pitch angle and vertical coordinate to obtain fourth candidate poses; transforming the third map points according to each fourth candidate pose to obtain their positions in the current image frame; and determining the capture pose from the fourth candidate poses according to the distances between the first feature point and the third map points in the current frame. This likewise reduces the computation required.
A second aspect of the embodiments of this application provides a terminal device positioning apparatus, including: a first matching module configured to obtain, from a vector map, a first map point matching a first feature point of the current image frame; a second matching module configured to obtain, from the vector map, a second map point matching a second feature point of other image frames preceding the current frame; and an optimization module configured to adjust, according to an objective function, the pose at which the terminal device captured the current frame to obtain the currently adjusted pose as the positioning result, where the objective function includes the first matching error between the first feature point and the first map point and the second matching error between the second feature point and the second map point.
It can be seen that the apparatus achieves the same effect as the method of the first aspect: because the objective function contains the matching errors of both the current frame and the other frames, the correlation between frames is taken into account, the factors considered are more comprehensive, and the positioning result is therefore more accurate.
In possible implementations, the apparatus further includes an acquisition module configured to obtain the first feature point of the current frame, the second feature points of the preceding frames, the pose at which the current frame was captured and the last-adjusted poses of the other frames; the first and second matching modules then retrieve the matching map points from the vector map according to those poses. The optimization module computes the initial values of the two matching errors from the distances in the first and second coordinate systems and iteratively solves the objective function until the preset iteration condition is met. The definitions of those distances, the iteration condition, the speed-dependent number of other frames, and the hierarchical sampling performed by the acquisition module (for both the three-degree-of-freedom and the six-degree-of-freedom pose, which effectively reduces the computation required during pose sampling) are as described for the corresponding implementations of the first aspect and are not repeated here.
A third aspect of the embodiments of this application provides a terminal device positioning apparatus including a memory and a processor; the memory stores code, the processor is configured to execute the code, and when the code is executed the apparatus performs the method of the first aspect or any of its possible implementations.
A fourth aspect provides a vehicle containing the terminal device positioning apparatus of the third aspect.
A fifth aspect provides a computer storage medium storing a computer program which, when executed by a computer, causes the computer to implement the method of the first aspect or any of its possible implementations.
A sixth aspect provides a computer program product storing instructions which, when executed by a computer, cause the computer to implement the method of the first aspect or any of its possible implementations.
In the embodiments of this application, as summarized for the first aspect, the objective function contains the matching errors of both the current image frame and the preceding image frames, so adjusting the pose of the current frame with it accounts for the correlation between frames; the factors considered are more comprehensive and the positioning result of the terminal device is more accurate.
Brief Description of Drawings
FIG. 1 is a schematic diagram of a vector map;
FIG. 2 is a schematic flowchart of a terminal device positioning method according to an embodiment of this application;
FIG. 3 is a schematic diagram of a three-dimensional coordinate system corresponding to a terminal device according to an embodiment of this application;
FIG. 4 is a schematic diagram of inter-frame pose differences according to an embodiment of this application;
FIG. 5 is a schematic diagram of first feature points of a current image frame according to an embodiment of this application;
FIG. 6 is a schematic diagram of computing the overlap degree according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a terminal device positioning apparatus according to an embodiment of this application;
FIG. 8 is another schematic structural diagram of a terminal device positioning apparatus according to an embodiment of this application.
Detailed Description of Embodiments
Embodiments of this application provide a terminal device positioning method and a related device, which can improve the accuracy of the positioning result of a terminal device.
The terms "first", "second" and the like in the specification, claims and drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Terms so used are interchangeable where appropriate; this is merely the way objects with the same properties are distinguished in describing the embodiments. Moreover, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, system, product or device comprising a series of units is not necessarily limited to those units but may include other units not expressly listed or inherent to it.
The embodiments of this application can be implemented by terminal devices such as in-vehicle devices in cars, drones, robots, and so on. For ease of description, the in-vehicle device of a car is referred to simply as the car below, and a moving car is used as the example.
When a car is being driven, high-precision positioning is needed if the user wants to determine where the car is. In the related art, a complete vector map is usually preset inside the car. FIG. 1 is a schematic diagram of a vector map. As shown in FIG. 1, the vector map can display the virtual traffic environment the car is currently in, containing the objects around the car such as traffic lights, light poles, signs, lane lines and so on. These objects are represented by pixels on the vector map, i.e., by map points: a light pole can be represented by a straight line formed from several map points, a sign by a rectangular box formed from several map points, and so on. Notably, the virtual traffic environment displayed by the vector map is drawn from the real-world traffic environment, whereas the pose of the car displayed on the vector map is generally computed by the car and may differ from its true real-world pose; the car's pose on the vector map therefore needs to be corrected and optimized to improve the accuracy of the positioning result. It should be understood that the pose of the car generally includes its position and its orientation, which is not repeated below.
Specifically, while driving, the car can capture the current image frame, which presents the real traffic environment the car is in at the current moment. The car can then match the feature points of the current frame against the map points of the vector map, which amounts to matching the real traffic environment against the virtual one. Finally, the car's pose on the vector map is adjusted according to the matching result (for example, the matching error between the two), and the optimized pose is taken as the positioning result.
However, determining the positioning result from the current image frame alone considers rather limited factors, so the accuracy of the result is low.
On this basis, to improve the accuracy of the positioning result, the embodiments of this application provide a terminal device positioning method. For ease of description, the pose at which the terminal device captured a given image frame is referred to below simply as the pose of that frame: for example, the pose at which the device captured the current image frame is the pose of the current frame; the pose at which it captured an earlier frame is the pose of that other frame; and after the current optimization (adjustment) of the pose of the current frame, the result is the currently optimized pose of the current frame, and so on, which is not repeated below. FIG. 2 is a schematic flowchart of the terminal device positioning method provided by an embodiment of this application. As shown in FIG. 2, the method includes:
201. Obtain the first feature point of the current image frame, the second feature points of the other image frames preceding it, the pose of the current frame, and the last-optimized poses of the other frames.
In this embodiment, the terminal device has a camera. While moving, it photographs the current traffic environment to obtain the current image frame, and it can further obtain the preceding image frames. The number of other frames can be determined from the speed of the terminal device, as in formula (1):

t = t0 + α·v   (1)

where t is the number of the current and other image frames taken together, t−1 is the number of other image frames, t0 is a preset threshold, α is a preset adjustment coefficient, and v is the current speed of the terminal device. Having obtained the current frame and the other frames, the terminal device can position itself based on them.
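A small sketch of the speed-dependent window; the linear form t = t0 + α·v is an assumption reconstructed from the variable definitions of formula (1) (the original rendering is an image), and the default values and the cap are invented for illustration:

```python
def window_size(v: float, t0: float = 5.0, alpha: float = 0.5,
                t_max: int = 20) -> int:
    """Number of frames (current + other) in the joint-optimization window.

    t0, alpha and t_max are illustrative values, not taken from the source.
    """
    return int(min(t_max, max(2.0, round(t0 + alpha * v))))
```

The intuition of the design choice: the faster the device moves, the more the scene changes between frames, so a larger window keeps enough distinct map evidence in the optimization.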
Having obtained the current frame and the other frames, the terminal device can obtain the pose of the current frame and the last-optimized poses of the other frames. Note that, for the pose of the current frame, the current optimization is performed based on the current frame and the other frames, yielding the currently optimized pose of the current frame and the currently optimized poses of the other frames; the last-optimized poses of the other frames are thus the results of the previous optimization performed on those frames.
The pose of the current frame can be obtained as follows: first, compute the predicted pose of the current frame from the last-optimized poses of the other frames and the inter-frame pose differences computed by the terminal device; then hierarchically sample the predicted pose to obtain the pose of the current frame.
Specifically, the terminal device may also have an odometer, which builds the three-dimensional coordinate system corresponding to the device, e.g., a vehicle-body coordinate system. FIG. 3 is a schematic diagram of this coordinate system. As shown in FIG. 3, its origin is the starting point of the device's motion, the X axis points straight ahead of the device at the starting point, the Y axis points to its left, and Z can default to zero. Once the device starts moving from the origin, its position and orientation keep changing (rotating and translating); during the motion, the odometer can compute the pose difference of the device between capturing two adjacent image frames, i.e., the pose difference between two adjacent frames, also called the inter-frame pose difference. It can be expressed by formula (2):

ΔT = {ΔR, Δt}   (2)

where ΔT is the inter-frame pose difference, ΔR is the rotation between two adjacent image frames, and Δt is the translation between them.
To further explain the inter-frame pose difference, FIG. 4 provides a schematic diagram. As shown in FIG. 4, suppose the current frame and the preceding frames total t frames, where F1 is the first of the other frames, F2 the second, ..., F(t−1) the last of the other frames (the frame immediately before the current one), and Ft the current frame. The odometer computes the pose difference between F1 and F2 as ΔT1, ..., and the pose difference between F(t−1) and Ft as ΔT(t−1). The predicted pose of the current frame can then be computed by formula (3):

Pt = P(t−1) · ΔT(t−1)   (3)

where Pt is the predicted pose of the current frame and P(t−1) is the last-optimized pose of the previous frame. Based on formula (3), the predicted pose can also be computed by formula (4):

Pt = (ΔT(t−1) · ΔT(t−2) · ... · ΔT(t−m)) · P(t−m)   (4)

where P(t−m) is the last-optimized pose of the (t−m)-th of the other image frames.
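Formulas (3) and (4) chain odometer increments onto a last-optimized pose; a minimal planar (x, y, yaw) version could look like the following, where the increment convention (each ΔT expressed in the frame of the preceding pose) is an assumption:

```python
import numpy as np

def compose(pose, delta):
    """Apply one odometer increment delta = (dx, dy, dyaw), expressed in the
    frame of `pose` (this increment convention is an assumption)."""
    x, y, yaw = pose
    dx, dy, dyaw = delta
    return np.array([x + np.cos(yaw) * dx - np.sin(yaw) * dy,
                     y + np.sin(yaw) * dx + np.cos(yaw) * dy,
                     yaw + dyaw])

def predict_current_pose(p_last_opt, odom_deltas):
    """Chain the increments ΔT(t-m), ..., ΔT(t-1) onto the last-optimized pose
    P(t-m) to predict the pose of the current frame, as in formula (4)."""
    pose = np.asarray(p_last_opt, dtype=float)
    for d in odom_deltas:
        pose = compose(pose, d)
    return pose
```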
After the predicted pose of the current frame is obtained, it can be hierarchically sampled to obtain the pose of the current frame, i.e., the initial pose value used for the current optimization. The predicted pose can be hierarchically sampled in several ways, described below.
In one possible implementation, if the pose of the current frame has three degrees of freedom (an abscissa, an ordinate and a heading angle), the hierarchical sampling proceeds as follows. (1) Arbitrarily select some map points within a pre-delimited region of the vector map as third map points, and obtain their positions in the vector map's three-dimensional coordinate system (3D coordinates) and the position of the first feature point in the current frame (2D coordinates). (2) Keep the heading angle of the predicted pose unchanged and vary its abscissa and ordinate to obtain first candidate poses. (3) Transform the third map points' positions according to each first candidate pose to obtain their positions in a preset image coordinate system, which amounts to projecting the third map points into that coordinate system. (4) Keep the abscissa and ordinate of the predicted pose unchanged and vary its heading angle to obtain second candidate poses. (5) Transform the first feature point's position in the current frame according to each second candidate pose to obtain its position in the image coordinate system, i.e., project the first feature point into it. (6) Determine the pose of the current frame from the combinations of first and second candidate poses according to the distances between the third map points and the first feature point in the image coordinate system. This sampling scheme effectively reduces the computation required.
As an example: with the heading fixed, sample the abscissa N1 times and the ordinate N2 times around their original values, giving N1×N2 first candidate poses, and project the third map points once per candidate, giving N1×N2 groups of projected map points; with the coordinates fixed, sample the heading N3 times, giving N3 second candidate poses and N3 groups of projected first feature points. Form the N1×N2×N3 combinations, compute the map-point-to-feature-point distance of each, pick the smallest, and compose the pose of the current frame from the abscissa and ordinate of the corresponding first candidate pose and the heading of the corresponding second candidate pose.
In another possible implementation, if the pose has six degrees of freedom (an abscissa, an ordinate, a vertical coordinate, a heading angle, a roll angle and a pitch angle), steps (1) through (5) are as above, except that the first candidate poses are obtained by keeping the heading, roll, pitch and vertical coordinate fixed while varying the abscissa and ordinate, and the second candidate poses by keeping the abscissa, ordinate, vertical coordinate, roll and pitch fixed while varying the heading. Step (6) then determines third candidate poses from the first/second combinations. (7) Keep the abscissa, ordinate, heading and roll of the third candidate pose fixed and vary its pitch angle and vertical coordinate to obtain fourth candidate poses. (8) Project the third map points into the current image frame according to each fourth candidate pose. (9) Determine the pose of the current frame from the fourth candidate poses according to the distances between the first feature point and the projected third map points in the current frame.
As an example of the extra stages: sampling the pitch N4 times and the vertical coordinate N5 times yields N4×N5 fourth candidate poses and N4×N5 groups of projected third map points; the smallest of the N4×N5 distances selects the pitch and vertical coordinate, which together with the abscissa, ordinate, heading and roll of the third candidate pose compose the pose of the current frame. This likewise effectively reduces the computation required during pose sampling.
After the pose of the current frame and the last-optimized poses of the other frames are obtained, semantic detection can be performed on the current frame and the other frames to obtain the first feature point of the current frame and the second feature points of the other frames. Specifically, a neural network performs semantic detection, i.e., feature extraction, on each frame; the first and second feature points can be understood as semantic identifiers on the images. Note that the first feature points cover the various object classes in the traffic environment. FIG. 5 is a schematic diagram of the first feature points of a current image frame provided by an embodiment of this application. As shown in FIG. 5, the feature points of a light pole or a lane line can be the pixels of its two endpoints, while the feature points of a traffic light or a sign can be a rectangular box (outer bounding box) formed from several pixels. The same holds for the second feature points of the other frames, which is not repeated here.
It should be understood that the aforementioned neural network is a trained model. Its training is briefly described below:
Before training, a batch of image frames to be trained is obtained and the ground-truth feature points of each frame are determined in advance. During training, each frame is fed to the model, which outputs predicted feature points. A target loss function measures the gap between the predicted feature points of each frame and the frame's ground-truth feature points; a frame whose gap falls within the tolerance is deemed qualified, otherwise unqualified. If only a few frames of the batch qualify, the model parameters are adjusted and another batch is used for training, until a large proportion of frames qualify, yielding the neural network used for semantic detection.
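A loose sketch of that training loop; the qualification test, the tolerance and all names here are illustrative placeholders, not details from the source:

```python
import torch

def train_detector(model, loader, loss_fn, optimizer,
                   tolerance=0.05, min_qualified=0.9, max_epochs=50):
    """Train until most frame batches 'qualify', i.e., their prediction error
    falls within `tolerance`."""
    for _ in range(max_epochs):
        qualified = 0
        for frames, gt_points in loader:
            pred_points = model(frames)             # predicted feature points
            loss = loss_fn(pred_points, gt_points)  # gap to the ground truth
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            qualified += int(loss.item() < tolerance)
        if qualified >= min_qualified * len(loader):
            break                                   # mostly qualified: done
    return model
```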
It should also be understood that, in this embodiment, the pose of the current frame generally refers to the pose of the terminal device in the vector map's three-dimensional coordinate system when capturing the current frame; likewise, the pose of another frame refers to the device's pose in that coordinate system when capturing that frame, and so on.
It should also be understood that the previous optimization proceeds in the same way as the current one, as does the next.
It should also be understood that, among all frames captured by the terminal device, the pose of the first frame can be obtained through the device's global positioning system (GPS) as the object of the first optimization.
202. According to the pose of the current frame, obtain from the vector map the first map point matching the first feature point.
After step 201 yields the pose of the current frame, i.e., the initial pose value for the current optimization, the first map point matching the first feature point can be obtained from the vector map preset in the terminal device based on that pose. This can be done in several ways, described below.
In one possible implementation, a region containing the terminal device (e.g., 150 m × 150 m) is delimited on the vector map, and the positions of the map points in this region in the vector map's three-dimensional coordinate system are coordinate-transformed according to the pose of the current frame to obtain their positions in the current image frame, which amounts to projecting these map points into the current frame. Since the first feature points and the region's map points both cover the various object classes, a nearest-neighbour algorithm can be applied to their positions in the current frame to match feature points and map points class by class, thereby determining the first map points among them. For example, in the vector map a light pole is represented by a straight line of map points, which projects into the current frame as a straight line (the projected line), whereas in the current frame a pole is represented by its two endpoint feature points. When poles A, B and C of the vector map project into the current frame, the pole matching pole D of the current frame is found by computing, for each of A, B and C, the mean distance from D's two endpoints to that pole's projected line; the pole with the smallest mean is the match, and its map points match D's feature points. Similarly, a sign (or traffic light) is represented in the vector map by a rectangular box of map points, which projects into the current frame as a rectangle, and in the current frame also by a rectangular box of feature points; when signs X and Y project into the frame, the sign matching sign Z is the one with the smaller mean distance from the four vertices of Z's box to the projected lines of two parallel sides of its own box, and its map points match Z's feature points. For a lane line, represented in the vector map by a straight line of map points and in the frame by two endpoint feature points, when lane lines E and F project into the frame, the line matching lane line G is chosen by a comprehensive distance combining (i) the mean distance from G's two endpoints to the projected line and (ii) the overlap degree between G and the projected line: the lane line with the smallest comprehensive distance (e.g., if E and F have equal overlap, the one with the smaller distance) is the match, and its map points match G's feature points.
Specifically, the overlap degree is computed as shown in FIG. 6, a schematic diagram provided by an embodiment of this application. Let JK be a lane line of the current frame and PQ the projected line of a vector-map lane line; let U be the foot of the perpendicular from endpoint J onto PQ, and V the foot of the perpendicular from endpoint K onto PQ. The overlap degree between lane line JK and projected line PQ is then given by formula (5):

l_overlap = d(UV∩PQ) / d(UV)   (5)

where l_overlap is the overlap degree, d(UV) is the length of segment UV, and d(UV∩PQ) is the length of the overlapping part of segments UV and PQ. By formula (5), the overlap degrees in FIG. 6, from left to right, are 1, d(PV)/d(UV), d(PQ)/d(UV) and 0.
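Formula (5) in code, with the four points reduced to scalar coordinates along the projected line; a minimal sketch:

```python
def overlap_degree(u: float, v: float, p: float, q: float) -> float:
    """l_overlap = |UV ∩ PQ| / |UV|, with U, V, P, Q given as scalar positions
    along the projected line PQ (U and V are the feet of the perpendiculars
    dropped from the lane line's endpoints J and K)."""
    lo, hi = min(u, v), max(u, v)
    plo, phi = min(p, q), max(p, q)
    inter = max(0.0, min(hi, phi) - max(lo, plo))
    return inter / (hi - lo) if hi > lo else 0.0
```

The four configurations of FIG. 6 then evaluate to 1, d(PV)/d(UV), d(PQ)/d(UV) and 0, matching the text.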
In another possible implementation, a region containing the terminal device is delimited on the vector map; the positions of the first feature points in the current frame are then coordinate-transformed according to the pose of the current frame to obtain their positions in the vector map's three-dimensional coordinate system, which amounts to projecting the first feature points into the vector map's 3D coordinates. Since the first feature points and the region's map points both cover the various object classes, the nearest-neighbour matching is performed in that coordinate system to determine the first map points.
In another possible implementation, a region containing the terminal device is delimited on the vector map, and both the region's map points and the first feature points are coordinate-transformed according to the pose of the current frame into the three-dimensional coordinate system corresponding to the terminal device; the nearest-neighbour matching is then performed there to determine the first map points.
In all three implementations, the feature points of all objects in the current frame and the map points of all objects in the delimited region are placed in a single coordinate system to complete the matching. Alternatively, the feature points and map points of some object classes (e.g., traffic lights, light poles, signs) can be matched in one coordinate system (e.g., the current image frame) while those of other classes (e.g., lane lines) are matched in another (e.g., the terminal device's three-dimensional coordinate system).
Notably, obtaining the first map point matching the first feature point amounts to obtaining the distance between the position of the first feature point in a first coordinate system and the position of the first map point in that system, which includes at least one of the following: 1. the distance between their positions in the current image frame; 2. the distance between their positions in the vector map's three-dimensional coordinate system; 3. the distance between their positions in the terminal device's three-dimensional coordinate system.
To illustrate, suppose the current frame contains light pole W1, sign W2 and lane lines W3 and W4, matched in the vector map by pole W5, sign W6 and lane lines W7 and W8 respectively. Depending on the meaning of the first coordinate system, there are the following cases:
Case one: when the first coordinate system is the current image frame, the distance is the one between the positions in the current frame, i.e., after projection into the current frame: the mean distance from W1's two endpoints to W5's projected line, the mean distance from the four vertices of W2's box to the projected lines of two parallel sides of W6's box, the comprehensive distance between W3 and W7, and the comprehensive distance between W4 and W8.
Case two: when the first coordinate system includes the current image frame and the terminal device's three-dimensional coordinate system, the distance includes the in-frame pole and sign distances above plus, after projection into the terminal device's three-dimensional coordinate system, the comprehensive distances between W3 and W7 and between W4 and W8.
Similarly, there are case three (the vector map's three-dimensional coordinate system), case four (the terminal device's three-dimensional coordinate system), case five (the current frame plus the vector map's three-dimensional coordinate system), case six (the terminal device's plus the vector map's three-dimensional coordinate systems) and case seven (the current frame plus both three-dimensional coordinate systems), which can be understood with reference to cases one and two and are not repeated here.
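All of these cases ultimately rest on projecting map points into the frame of interest; a bare pinhole sketch, ignoring lens distortion and the camera-to-vehicle extrinsics (both simplifying assumptions):

```python
import numpy as np

def project_map_point(pt_map, R, t, K):
    """Project a vector-map 3D point into the current image frame.

    (R, t) is the camera pose in the map frame derived from the pose at which
    the frame was captured; K is the 3x3 intrinsic matrix.
    """
    p_cam = R.T @ (np.asarray(pt_map, dtype=float) - t)  # map -> camera frame
    if p_cam[2] <= 0:
        return None                                      # behind the camera
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]                              # pixel coordinates
```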
203. According to the last-optimized poses of the other image frames, obtain from the vector map the second map points matching the second feature points.
After step 201 yields the last-optimized poses of the other frames, i.e., their initial pose values for the current optimization, the second map points matching the second feature points can be obtained from the vector map inside the terminal device based on those poses.
Obtaining the second map point amounts to obtaining the distance between the position of the second feature point in a second coordinate system and the position of the second map point in that system, which includes at least one of the following: 1. the distance between their positions in the other image frames; 2. the distance between their positions in the vector map's three-dimensional coordinate system; 3. the distance between their positions in the terminal device's three-dimensional coordinate system.
For the process of obtaining the second map point, refer to the description of obtaining the first map point in step 202; likewise, for these distances refer to the description in step 202 of the distances between the first feature point and the first map point in the first coordinate system. Details are not repeated here.
204. Adjust the pose of the current image frame according to the objective function to obtain the currently optimized pose of the current frame as the positioning result of the terminal device; the objective function includes the first matching error between the first feature point and the first map point and the second matching error between the second feature point and the second map point.
After the first map point matching the first feature point and the second map point matching the second feature point are obtained, the objective function constructed from the two matching errors is used to adjust, i.e., optimize, the pose of the current frame, and the currently optimized pose is obtained as the positioning result.
Specifically, the initial value of the first matching error is first obtained from the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system. Continuing the example above, it can be obtained by formula (6), which can be read as a robustified sum of the per-object distances:

e = Σ_i Huber_ε1(d_pp^i) + Σ_i Huber_ε1(d_pl^i) + Σ_i Huber_ε2(d_pH^i)   (6)

where the first matching error is determined through Huber_ε1 and Huber_ε2; Huber_ε1 is the Huber loss function with parameter ε1 and Huber_ε2 the one with parameter ε2; β is a preset parameter; d_pp is the distance corresponding to pole-like objects in the current frame, d_pp^i being the distance between the i-th pole and its matched pole, i.e., the mean distance from the i-th pole's two endpoints to the matched pole's projected straight line; d_pl is the distance corresponding to traffic lights (or signs), d_pl^i being the distance between the i-th traffic light (or sign) and its match, i.e., the mean distance from its four vertices to the projected straight lines of two parallel sides of the matched rectangular box; and d_pH is the comprehensive distance corresponding to lane lines, d_pH^i being the comprehensive distance between the i-th lane line and its match, which combines, with the weight β, the distance d_H^i from the i-th lane line's two endpoints to the matched projected line and the overlap degree l_overlap^i between the two lines.
Further, the initial value of the second matching error can be computed from the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system; as in the example above, it can also be obtained by formula (6) and is not repeated here.
After both initial values are obtained, they can be fed into the objective function, which is solved iteratively until the preset iteration condition is met, yielding the currently optimized pose of the current frame. Based on formula (6), the objective function can be expressed by formula (7), which sums the formula-(6) terms over the current image frame and the other image frames:

min Σ_{i=1}^{t} ( Σ_j Huber_ε1(d_pp^{i,j}) + Σ_k Huber_ε1(d_pl^{i,k}) + Σ_l Huber_ε2(d_pH^{i,l}) )   (7)

where, over the current and other image frames, the three inner sums cover, respectively, the distances corresponding to pole-like objects in the i-th image frame, to traffic lights (or signs) in the i-th image frame, and to lane lines in the i-th image frame.
It should be understood that formulas (6) and (7) are merely illustrative and do not limit how the matching errors are computed or how the objective function is expressed.
During the iterative solution of the objective function, completing the first iteration, i.e., feeding the initial values of the two matching errors into the function and solving it, yields the first-iteration pose of the current frame and the first-iteration poses of the other frames. The inter-frame pose differences are then computed from these poses; if the difference between them and the inter-frame pose differences computed by the terminal device's odometer is less than the preset threshold, the objective function is considered converged, the iteration stops, and the first-iteration pose of the current frame is taken as the currently optimized pose of the current frame; if the difference is greater than or equal to the threshold, a second iteration is performed.
In the second iteration, the first map points matching the first feature points are re-determined from the first-iteration pose of the current frame (step 202 is re-executed), and the second map points matching the second feature points from the first-iteration poses of the other frames (step 203 is re-executed). The first-iteration values of the two matching errors are then computed and fed into the objective function, yielding the second-iteration poses of the current frame and of the other frames. The convergence test is repeated: if it passes, the second-iteration pose of the current frame is taken as the currently optimized pose; otherwise a third iteration follows, and so on until the iteration count equals the preset number, at which point the objective function is also considered converged and the pose of the current frame obtained in the last iteration is taken as the currently optimized pose.
In this embodiment, after the current frame and the preceding frames are obtained, the first map point matching the first feature point of the current frame and the second map points matching the second feature points of the other frames can be retrieved from the vector map, and the pose of the current frame can be adjusted according to the objective function built from both matching errors to obtain the currently optimized pose. Because the objective function contains the matching errors of both the current frame and the other frames, the adjustment accounts for the influence of both on the optimization, i.e., for the correlation between the current frame and the other frames; the factors considered are more comprehensive, so the positioning result of the terminal device is more accurate.
Further, in the related art the objective function is built only from the matching error between the current frame's feature points and the map points; since a single frame can present only limited content, the candidate map points matched to its feature points are often sparse and overlapping, so the matching error cannot be made small enough during the iterative solution, which harms the accuracy of the positioning result. Here, the objective function is built from the first matching error between the current frame's first feature point and the vector map's first map point and the second matching error between the other frames' second feature points and the vector map's second map points; because the content presented by multiple frames usually differs considerably, sparsity and overlap of map points are avoided, and the iterative solution (a joint optimization of the poses of multiple frames) can make both matching errors small enough, improving the accuracy of the positioning result.
Further, the pose of the current frame obtained by hierarchical sampling can serve as the initial pose value for the current optimization, improving its convergence speed and robustness.
The terminal device positioning method provided by the embodiments of this application has been described in detail above; the terminal device positioning apparatus is introduced below. FIG. 7 is a schematic structural diagram of the apparatus provided by an embodiment of this application. As shown in FIG. 7, the apparatus includes:
a first matching module 701 configured to obtain, from the vector map, the first map point matching the first feature point of the current image frame captured by the terminal device;
a second matching module 702 configured to obtain, from the vector map, the second map point matching the second feature point of the other image frames preceding the current frame;
an adjustment module 703 configured to adjust, according to the objective function, the pose at which the terminal device captured the current frame, obtaining the currently adjusted pose as the positioning result of the terminal device, the objective function including the first matching error between the first feature point and the first map point and the second matching error between the second feature point and the second map point.
In possible implementations, the apparatus further includes an acquisition module 700 configured to obtain the first feature point, the second feature points, the pose at which the current frame was captured and the last-adjusted poses of the other frames; the two matching modules then retrieve the matching map points according to those poses. The adjustment module 703 computes the initial values of the two matching errors from the distances in the first and second coordinate systems and iteratively solves the objective function until the preset iteration condition is met. The definitions of those distances, the iteration condition, the speed-dependent number of other frames, and the hierarchical sampling performed by the acquisition module 700 for both the three-degree-of-freedom and the six-degree-of-freedom pose (which effectively reduces the computation required during pose sampling) are as described for the method embodiment above and are not repeated here.
It should be noted that, since the information exchange and execution processes among the modules of the above apparatus are based on the same concept as the method embodiments of this application, the technical effects they bring are the same as those of the method embodiments; for details, refer to the descriptions in the method embodiments shown above, which are not repeated here.
FIG. 8 is another schematic structural diagram of the terminal device positioning apparatus provided by an embodiment of this application. As shown in FIG. 8, an embodiment of the computer in this embodiment of this application may include one or more central processing units 801, a memory 802, an input/output interface 803, a wired or wireless network interface 804, and a power supply 805.
The memory 802 may be transient or persistent storage. Further, the central processing unit 801 may be configured to communicate with the memory 802 and execute, on the computer, a series of instruction operations in the memory 802.
In this embodiment, the central processing unit 801 may perform the method steps of the embodiment shown in FIG. 2, which are not repeated here.
The division of specific functional modules in the central processing unit 801 may be similar to the division into the acquisition, first matching, second matching and adjustment modules described in FIG. 7, and is not repeated here.
Embodiments of this application also relate to a computer storage medium comprising computer-readable instructions which, when executed, implement the method described in FIG. 2.
Embodiments of this application also relate to a computer program product containing instructions which, when run on a computer, cause the computer to perform the method described in FIG. 2.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a logical functional division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application, in essence or the part contributing to the prior art, or all or part of them, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

Claims (19)

  1. A terminal device positioning method, wherein the method comprises:
    obtaining, from a vector map, a first map point that matches a first feature point of a current image frame captured by a terminal device;
    obtaining, from the vector map, a second map point that matches a second feature point of other image frames preceding the current image frame; and
    adjusting, according to an objective function, a pose at which the terminal device captured the current image frame, to obtain a currently adjusted pose at which the terminal device captured the current image frame as a positioning result of the terminal device, wherein the objective function comprises a first matching error between the first feature point and the first map point, and a second matching error between the second feature point and the second map point.
  2. The method according to claim 1, wherein the method further comprises:
    obtaining the first feature point of the current image frame, the second feature point of the other image frames preceding the current image frame, the pose at which the terminal device captured the current image frame, and a last-adjusted pose at which the terminal device captured the other image frames;
    the obtaining, from a vector map, of a first map point that matches the first feature point comprises:
    obtaining, from the vector map, the first map point that matches the first feature point according to the pose at which the terminal device captured the current image frame; and
    the obtaining, from the vector map, of a second map point that matches the second feature point comprises:
    obtaining, from the vector map, the second map point that matches the second feature point according to the last-adjusted pose at which the terminal device captured the other image frames.
  3. The method according to claim 1 or 2, wherein the adjusting, according to an objective function, of the pose at which the terminal device captured the current image frame to obtain the currently adjusted pose comprises:
    computing an initial value of the first matching error from a distance between a position of the first feature point in a first coordinate system and a position of the first map point in the first coordinate system;
    computing an initial value of the second matching error from a distance between a position of the second feature point in a second coordinate system and a position of the second map point in the second coordinate system; and
    iteratively solving the objective function according to the initial value of the first matching error and the initial value of the second matching error until a preset iteration condition is met, to obtain the currently adjusted pose at which the terminal device captured the current image frame.
  4. The method according to claim 3, wherein the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system comprises at least one of the following:
    a distance between a position of the first feature point in the current image frame and a position of the first map point in the current image frame;
    a distance between a position of the first feature point in a three-dimensional coordinate system corresponding to the vector map and a position of the first map point in the three-dimensional coordinate system corresponding to the vector map; or
    a distance between a position of the first feature point in a three-dimensional coordinate system corresponding to the terminal device and a position of the first map point in the three-dimensional coordinate system corresponding to the terminal device.
  5. The method according to claim 3, wherein the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system comprises at least one of the following:
    a distance between a position of the second feature point in the other image frames and a position of the second map point in the other image frames;
    a distance between a position of the second feature point in a three-dimensional coordinate system corresponding to the vector map and a position of the second map point in the three-dimensional coordinate system corresponding to the vector map; or
    a distance between a position of the second feature point in a three-dimensional coordinate system corresponding to the terminal device and a position of the second map point in the three-dimensional coordinate system corresponding to the terminal device.
  6. The method according to any one of claims 3 to 5, wherein the iteration condition is: for any iteration, if a difference between an inter-frame pose difference obtained by the iteration and an inter-frame pose difference computed by the terminal device is less than a preset threshold, the iteration stops, wherein the inter-frame pose difference obtained by the iteration is determined from the pose at which the terminal device captured the current image frame obtained by the iteration and the poses at which the terminal device captured the other image frames obtained by the iteration, and an inter-frame pose difference is, among the current image frame and the other image frames, the pose difference of the terminal device between capturing two adjacent image frames; if the difference is greater than or equal to the threshold, the next iteration is executed until the number of iterations equals a preset number.
  7. The method according to any one of claims 1 to 6, wherein the number of the other image frames is determined according to a speed of the terminal device.
  8. The method according to any one of claims 2 to 7, wherein the obtaining of the pose at which the terminal device captured the current image frame comprises:
    computing a predicted pose at which the terminal device captured the current image frame from the last-adjusted pose at which the terminal device captured the other image frames and the inter-frame pose difference computed by the terminal device; and
    hierarchically sampling the predicted pose to obtain the pose at which the terminal device captured the current image frame.
  9. A terminal device positioning apparatus, wherein the apparatus comprises:
    a first matching module configured to obtain, from a vector map, a first map point that matches a first feature point of a current image frame captured by a terminal device;
    a second matching module configured to obtain, from the vector map, a second map point that matches a second feature point of other image frames preceding the current image frame; and
    an adjustment module configured to adjust, according to an objective function, a pose at which the terminal device captured the current image frame, to obtain a currently adjusted pose at which the terminal device captured the current image frame as a positioning result of the terminal device, wherein the objective function comprises a first matching error between the first feature point and the first map point, and a second matching error between the second feature point and the second map point.
  10. The apparatus according to claim 9, further comprising:
    an acquisition module configured to obtain the first feature point of the current image frame, the second feature point of the other image frames preceding the current image frame, the pose at which the terminal device captured the current image frame, and a last-adjusted pose at which the terminal device captured the other image frames;
    wherein the first matching module is configured to obtain, from the vector map, the first map point that matches the first feature point according to the pose at which the terminal device captured the current image frame; and
    the second matching module is configured to obtain, from the vector map, the second map point that matches the second feature point according to the last-adjusted pose at which the terminal device captured the other image frames.
  11. The apparatus according to claim 9 or 10, wherein the adjustment module is configured to:
    compute an initial value of the first matching error from a distance between a position of the first feature point in a first coordinate system and a position of the first map point in the first coordinate system;
    compute an initial value of the second matching error from a distance between a position of the second feature point in a second coordinate system and a position of the second map point in the second coordinate system; and
    iteratively solve the objective function according to the initial value of the first matching error and the initial value of the second matching error until a preset iteration condition is met, to obtain the currently adjusted pose at which the terminal device captured the current image frame.
  12. The apparatus according to claim 11, wherein the distance between the position of the first feature point in the first coordinate system and the position of the first map point in the first coordinate system comprises at least one of the following:
    a distance between a position of the first feature point in the current image frame and a position of the first map point in the current image frame;
    a distance between a position of the first feature point in a three-dimensional coordinate system corresponding to the vector map and a position of the first map point in the three-dimensional coordinate system corresponding to the vector map; or
    a distance between a position of the first feature point in a three-dimensional coordinate system corresponding to the terminal device and a position of the first map point in the three-dimensional coordinate system corresponding to the terminal device.
  13. The apparatus according to claim 11, wherein the distance between the position of the second feature point in the second coordinate system and the position of the second map point in the second coordinate system comprises at least one of the following:
    a distance between a position of the second feature point in the other image frames and a position of the second map point in the other image frames;
    a distance between a position of the second feature point in a three-dimensional coordinate system corresponding to the vector map and a position of the second map point in the three-dimensional coordinate system corresponding to the vector map; or
    a distance between a position of the second feature point in a three-dimensional coordinate system corresponding to the terminal device and a position of the second map point in the three-dimensional coordinate system corresponding to the terminal device.
  14. The apparatus according to any one of claims 11 to 13, wherein the iteration condition is: for any iteration, if a difference between an inter-frame pose difference obtained by the iteration and an inter-frame pose difference computed by the terminal device is less than a preset threshold, the iteration stops, wherein the inter-frame pose difference obtained by the iteration is determined from the pose at which the terminal device captured the current image frame obtained by the iteration and the poses at which the terminal device captured the other image frames obtained by the iteration, and an inter-frame pose difference is, among the current image frame and the other image frames, the pose difference of the terminal device between capturing two adjacent image frames; if the difference is greater than or equal to the threshold, the next iteration is executed until the number of iterations equals a preset number.
  15. The apparatus according to any one of claims 9 to 14, wherein the number of the other image frames is determined according to a speed of the terminal device.
  16. The apparatus according to any one of claims 10 to 15, wherein the acquisition module is configured to:
    compute a predicted pose at which the terminal device captured the current image frame from the last-adjusted pose at which the terminal device captured the other image frames and the inter-frame pose difference computed by the terminal device; and
    hierarchically sample the predicted pose to obtain the pose at which the terminal device captured the current image frame.
  17. A terminal device positioning apparatus, comprising a memory and a processor; the memory stores code, and the processor is configured to execute the code; when the code is executed, the terminal device positioning apparatus performs the method according to any one of claims 1 to 8.
  18. A computer storage medium, wherein the computer storage medium stores a computer program which, when executed by a computer, causes the computer to implement the method according to any one of claims 1 to 8.
  19. A computer program product, wherein the computer program product stores instructions which, when executed by a computer, cause the computer to implement the method according to any one of claims 1 to 8.
