CN113920263A - Map construction method, map construction device, map construction equipment and storage medium


Info

Publication number
CN113920263A
Authority
CN
China
Prior art keywords
image
image data
feature
feature map
map
Legal status
Pending
Application number
CN202111210541.3A
Other languages
Chinese (zh)
Inventor
张壮 (Zhang Zhuang)
孙瀚 (Sun Han)
姜翰青 (Jiang Hanqing)
章国锋 (Zhang Guofeng)
鲍虎军 (Bao Hujun)
Current Assignee
Zhejiang Shangtang Technology Development Co Ltd
Original Assignee
Zhejiang Shangtang Technology Development Co Ltd
Application filed by Zhejiang Shangtang Technology Development Co Ltd
Priority to CN202111210541.3A
Publication of CN113920263A
Priority to PCT/CN2022/093620 (WO2023065657A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/10032 Satellite or aerial image; Remote sensing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)
  • Navigation (AREA)
  • Instructional Devices (AREA)

Abstract

The embodiments of the present application provide a map construction method, apparatus, device, and storage medium, wherein the method includes: acquiring first image data of an area to be drawn and navigation data corresponding to the first image data; determining a first feature map based on the first image data and the navigation data; updating the first feature map based on second image data of the area to be drawn to obtain a second feature map, where the first image data and the second image data are acquired in different manners; and constructing a three-dimensional map of the area to be drawn based on the second image data, the second feature map, and the first image data.

Description

Map construction method, map construction device, map construction equipment and storage medium
Technical Field
The embodiments of the present application relate to the technical field of visual positioning, and in particular, but not exclusively, to a map construction method, apparatus, device, and storage medium.
Background
With the continuous development of computer and communication technologies, maps provide great help for people when traveling. In the related art, maps are generated through conventional acquisition, manual processing, and the like, so the resulting maps have limited precision and cannot satisfy user requirements well.
Disclosure of Invention
The embodiment of the application provides a technical scheme for map construction.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a map construction method, which comprises the following steps:
acquiring first image data of a region to be drawn and navigation data corresponding to the first image data;
determining a first feature map based on the first image data and the navigation data;
updating the first feature map based on second image data of the area to be drawn to obtain a second feature map; the first image data and the second image data are acquired in different modes;
and constructing a three-dimensional map of the area to be drawn based on the second image data, the second feature map and the first image data.
In some embodiments, said determining a first feature map based on said first image data and said navigation data comprises: performing feature extraction on the first image data to obtain image feature points and description information of the image feature points; matching different images in the first image data based on the description information of the image feature points to obtain an image relation library representing the matching relation between the different images; determining the first feature map based on the image relational library and the navigation data. In this way, by combining the feature point coordinates in the image relational database with the navigation data, the first feature map representing the spatial position of the image feature point in the image data can be constructed easily and quickly.
In some embodiments, the determining the first feature map based on the image relational library and the navigation data includes: determining a first pose of an acquisition device of the first image data based on the navigation data and preset first acquisition parameters, where the preset first acquisition parameters include extrinsic parameters of the device acquiring the first image data and extrinsic parameters of the device acquiring the navigation data; and determining the first feature map based on the image feature points in the relational database, the first pose, and second acquisition parameters, where the second acquisition parameters include intrinsic parameters of the device acquiring the first image data. In this way, based on the triangulation principle, the feature map of the image data can be quickly constructed by combining the feature matching relationship with the initial image poses provided by the navigation data.
In some embodiments, the determining the first feature map based on the image feature points in the relational database, the first pose, and the second acquisition parameters includes: based on the first pose and the second acquisition parameters, triangulating the first true-value positions of the image feature points in the image to obtain three-dimensional coordinates of the image feature points in a world coordinate system; and constructing the first feature map based on the three-dimensional coordinates.
In some embodiments, said constructing said first feature map based on said three-dimensional coordinates comprises: constructing an initial feature map representing the spatial position of the image feature points based on the three-dimensional coordinates; determining a first predicted position of the spatial position projected to the first image data based on the transformation parameter of the navigation data to the camera coordinate system and the second acquisition parameter; determining a first difference between the first true position and the first predicted position; and adjusting the spatial position of the feature point in the initial feature map based on the first difference to obtain the first feature map. Therefore, the spatial position and the image pose of the feature point in the initial feature map are optimized through the first difference value, and the accuracy of the obtained first feature map is higher.
In some embodiments, the updating the first feature map based on the second image data of the area to be drawn to obtain a second feature map includes: updating an image relational database based on the matching relationship between the second image data and the first image data to obtain an updated image relational database; determining an image to be registered in the updated image relational database; target feature points corresponding to the three-dimensional points in the first feature map exist in the image to be registered; determining the image pose of the image to be registered based on the target feature points; registering the image pose to the first feature map to obtain a registered feature map; adjusting the registered feature map based on other feature points in the image to be registered and the second image data to obtain the second feature map; and the other feature points are feature points except the target feature points in the image to be registered. Therefore, the first feature map is updated by adopting the second image data with different sources, and the second feature map with higher coverage is obtained.
In some embodiments, the registering the image pose into the first feature map to obtain a registered feature map further includes: determining the number of target feature points included in each image to be registered; determining a registration sequence of each image to be registered based on the number; and registering the image pose of each image to be registered into the first feature map based on the registration sequence to obtain the registered feature map. Therefore, the registration sequence of the images to be registered is determined according to the number of the target features, and the accuracy of the registered image poses can be improved.
In some embodiments, the adjusting the registered feature map based on other feature points in the image to be registered and the second image data to obtain the second feature map includes: sampling the other characteristic points to obtain sampling characteristic points; triangularization is carried out on the sampling characteristic points, and three-dimensional coordinates of the sampling characteristic points in a world coordinate system are determined; and adjusting the registered feature map based on the three-dimensional coordinates of the sampling feature points and the second image data to obtain the second feature map. Therefore, by uniformly sampling other feature points in the sampling area, the excessive concentration of image features can be reduced, and the complexity of global optimization is reduced.
In some embodiments, the adjusting the registered feature map based on the three-dimensional coordinates of the sampling feature point and the second image data to obtain the second feature map includes: determining a second prediction position of projecting the three-dimensional coordinates of the sampling characteristic points to first image data and a third prediction position of projecting the three-dimensional coordinates of the sampling characteristic points to second image data based on the conversion parameters of the navigation data to a camera coordinate system; determining a second difference value between the second predicted position and a second true position of the sampling feature point in the image, and a third difference value between the third predicted position and the second true position; and adjusting the spatial position of the feature point in the registered feature map based on a second difference value and the third difference value to obtain the second feature map. In this way, the spatial position of the feature point in the first feature map is optimized through the second difference and the third difference, so that the integrity of the second feature map is higher.
In some embodiments, the constructing a three-dimensional map of the area to be rendered based on the second image data, the second feature map, and the first image data includes: performing depth processing on the second image data, the second feature map and the first image data to generate point cloud data representing the first image data and the second image data; constructing an initial grid model representing a connection relation between the point cloud data based on the point cloud data; determining the three-dimensional map based on the initial mesh model. Therefore, the high-precision characteristic of vehicle-mounted data can be retained, and the high coverage rate of aerial photography visual angles is combined, so that the accuracy of the obtained three-dimensional map is higher.
In some embodiments, the depth processing the second image data, the second feature map, and the first image data to generate point cloud data characterizing the first image data and the second image data comprises: performing depth estimation on the second image data, the second feature map and the first image data to obtain a depth map; and fusing the depth map to obtain the point cloud data. Therefore, the obtained point cloud data is richer.
In some embodiments, the constructing an initial mesh model characterizing a connection relationship between the point cloud data based on the point cloud data comprises: determining the visibility of each point in the point cloud data in the second feature map and the reprojection error of each point; determining a target point with visibility and a reprojection error meeting preset conditions in the point cloud data; and taking the target points as vertexes, and connecting the vertexes to obtain the initial mesh model. Therefore, the tetrahedron is constructed by the points in the point cloud data according to the visibility and the reprojection error of the points, so that the obtained initial mesh model can fully reflect the connection relation between target points.
In some embodiments, the determining the three-dimensional map based on the initial mesh model includes: determining the acquisition sources of the point cloud data corresponding to the vertices of each face in the initial mesh model; determining, based on the acquisition sources, a weight characterizing penetration of each face by a line of sight; and determining the three-dimensional map based on the faces whose weight is smaller than a preset threshold. In this way, by analyzing the weight with which each face of a tetrahedron is penetrated by lines of sight, faces with smaller weights are selected to construct the three-dimensional map, so that the constructed three-dimensional map is smoother.
In some embodiments, the first image data acquisition device comprises an on-board camera, and/or the second image data acquisition device is an aerial device.
The embodiment of the application provides a map construction device, the device includes:
the device comprises a first acquisition module, a second acquisition module and a display module, wherein the first acquisition module is used for acquiring first image data of a region to be drawn and navigation data corresponding to the first image data;
a first determination module to determine a first feature map based on the first image data and the navigation data;
the first updating module is used for updating the first feature map based on second image data of the area to be drawn to obtain a second feature map; the first image data and the second image data are acquired in different modes;
and the first construction module is used for constructing the three-dimensional map of the area to be drawn based on the second image data, the second characteristic map and the first image data.
Correspondingly, an embodiment of the present application provides a computer storage medium, where computer-executable instructions are stored on the computer storage medium, and after being executed, the computer-executable instructions can implement the above-mentioned method steps.
An embodiment of the present application provides an electronic device, where the electronic device includes a memory and a processor, where the memory stores computer-executable instructions, and the processor can implement the steps of the method when executing the computer-executable instructions on the memory.
The embodiments of the present application provide a map construction method, apparatus, device, and storage medium. For an area to be drawn of a three-dimensional map, first, a first feature map representing the spatial poses of image features of first image data can be quickly constructed by taking navigation data acquired for the area to be drawn as the image poses of the first image data; then, the first feature map is optimized and updated based on second image data acquired in a different manner, so that the information in the second feature map is richer; finally, by combining the two types of image data acquired in different manners with the second feature map, a three-dimensional map with higher coverage and higher precision can be constructed.
Drawings
Fig. 1 is a schematic diagram of an implementation flow of a map construction method provided in an embodiment of the present application;
fig. 2A is a schematic flowchart of another implementation of the map construction method provided in an embodiment of the present application;
fig. 2B is a schematic flowchart of another implementation of the map construction method provided in an embodiment of the present application;
fig. 3 is a schematic diagram of an implementation flow of a map construction method provided in the embodiment of the present application;
fig. 4 is a schematic diagram of an implementation process of initially constructing a feature map according to an embodiment of the present application;
fig. 5 is a schematic flow chart illustrating an implementation process of updating a feature map according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an implementation process of building a grid model according to an embodiment of the present application;
fig. 7 is a schematic structural component diagram of a map building apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, specific technical solutions of the present invention will be described in further detail below with reference to the accompanying drawings in the embodiments of the present application. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, references to the terms "first/second/third" merely distinguish similar objects and do not imply a particular order; it should be understood that "first/second/third" may be interchanged in specific order or sequence where permitted, so that the embodiments of the application described herein can be implemented in an order other than that shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Computer vision: using cameras and computers instead of human eyes to perform machine vision tasks such as identification, tracking, and measurement of targets, and further performing image processing so that the processed result is an image better suited for human observation or for transmission to instruments for detection.
2) Structure from Motion (SfM): since the input images are unordered, images with overlapping views must first be associated; the output of the structure-from-motion stage is a geometrically verified set of associated images together with the image projection points corresponding to the map points.
3) Feature map: a representation of the environment using relevant geometric features (such as points, lines, and planes), commonly used in Simultaneous Localization and Mapping (SLAM) and SfM. In the embodiments of the present application, a point-feature representation is used, including information such as the spatial position, color, feature descriptor, and visible images of each point.
4) Triangular mesh: is one type of polygonal Mesh, also known as "Mesh", which is a data structure used in computer graphics to model various irregular objects. The essence is to represent a real continuous object with a large number of small triangular patches.
An exemplary application of the mapping apparatus provided by the embodiments of the present application is described below, wherein the apparatus provided by the embodiments of the present application can be implemented as various types of electronic apparatuses such as a notebook computer with an image capture function, a tablet computer, a desktop computer, a mobile apparatus (e.g., a personal digital assistant, a dedicated messaging device, a portable game device), and the like.
Next, an exemplary application when the map building apparatus is implemented as an electronic apparatus will be explained.
Fig. 1 is a schematic flowchart of an implementation of the map construction method provided in an embodiment of the present application; the method is described below in conjunction with the steps shown in Fig. 1:
step S101, in a region to be drawn, acquiring first image data and navigation data corresponding to the first image data.
In some embodiments, the image capture device of the first image data may be any device or apparatus having image capture functionality; such as an in-vehicle video camera, an in-vehicle camera, or a roadside camera. The first image data of the area to be drawn is acquired, and can be acquired in real time through an image acquisition device or received image data sent by other equipment. The first image data may be an outdoor image collected in an area to be rendered, may be a simple image whose picture content includes an outdoor scene, or may be a complex image whose picture content includes an outdoor scene. The first image data may include a plurality of images, and the navigation data is data of the navigation device at the time of acquiring each image. The navigation data and the first image data may be acquired synchronously, for example, the first image data and the navigation data are acquired synchronously by using a vehicle-mounted camera and a vehicle-mounted navigation device. Or, the navigation data and the first image data may also be acquired asynchronously, for example, after acquiring the first image data for the region to be drawn, the navigation data of the region is acquired by using the navigation device; in this way, the pixel coordinates of each point in the navigation data can be taken as the pose of the point in the first image data.
Step S102, determining a first feature map based on the first image data and the navigation data.
In some embodiments, the first feature map represents the spatial positions of the feature points of the first image data. When the first image data is acquired, the navigation data of the navigation device is used as the initial image poses of the first image data; for the plurality of images in the first image data, matched feature points among the images are determined, and an image relation library is constructed based on these matched feature points. The pose of the acquisition device is determined by combining the image poses of the first image data obtained from the navigation device with the extrinsic parameters of the acquisition device of the first image data; the feature points in the first image data are then triangulated to obtain their spatial positions in the world coordinate system, thereby preliminarily establishing a first feature map representing the spatial positions of the feature points in the first image data.
Step S103, updating the first feature map based on the second image data of the area to be drawn to obtain a second feature map.
In some embodiments, the first image data and the second image data are acquired in different manners, including: the source of the first image data is different from the source of the second image data, the acquisition device of the second image data is different from the acquisition device of the first image data, or the acquisition angle of the first image data is different from the acquisition angle of the second image data. For example, the acquisition device of the first image data is a vehicle-mounted camera, and the acquisition device of the second image data is an aerial unmanned aerial vehicle. In some possible implementations, the first image data may be data acquired on the ground and the second image data may be data acquired in the air. And matching the first image data and the second image data with different sources, and updating the first feature map by adopting an incremental motion restoration structure method based on the matched first image data and second image data to obtain a second feature map.
And step S104, constructing a three-dimensional map of the area to be drawn based on the second image data, the second feature map and the first image data.
In some embodiments, first, the first image data and the second image data are converted into point cloud data; then, by analyzing the visibility and reprojection error of each point in the point cloud data, it is determined whether the point can serve as a vertex, so as to construct tetrahedra; finally, the constructed tetrahedra are optimized by aggregating the features of the first image data and the second image data, yielding a smoother three-dimensional map.
In the embodiment of the application, firstly, the acquired navigation data is used as the image pose of the first image data, so that a first feature map representing the image feature space pose of the first image data can be quickly constructed; then, the first feature map is optimized and updated by adopting second image data with different acquisition modes, so that feature points in the second feature map can be richer; and finally, combining two image data with different acquisition modes with the second characteristic map, so that a three-dimensional map with higher coverage and higher precision can be constructed.
In some embodiments, to implement fast construction of the first feature map, step S102 may be implemented by the steps shown in Fig. 2A. Fig. 2A is a schematic flowchart of another implementation of the map construction method provided in an embodiment of the present application; the following description is made with reference to the steps shown in Figs. 1 and 2A:
step S201, performing feature extraction on the first image data to obtain image feature points and description information of the image feature points.
In some embodiments, Scale-Invariant Feature Transform (SIFT) feature points are extracted for each image in the first image data, and the description information of the feature points is obtained. The description information includes the position of each feature point in the image (e.g., its two-dimensional coordinates) and the scale- and rotation-invariant descriptor of the feature point. The image feature points extracted from the first image data, including their positions in the images, may thus be represented in the form of two-dimensional coordinates.
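By way of illustration only, the feature extraction described above can be sketched in Python as follows (this sketch is not part of the original application; OpenCV is an assumed dependency, and all names are illustrative):

```python
import cv2

def extract_sift_features(image_paths):
    """Extract SIFT keypoints and descriptors for each image.

    Returns a dict mapping image path -> (keypoints, descriptors); each
    keypoint carries its 2D position in the image, and each descriptor is a
    128-dimensional vector that is invariant to scale and rotation.
    """
    sift = cv2.SIFT_create()
    features = {}
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        keypoints, descriptors = sift.detectAndCompute(img, None)
        features[path] = (keypoints, descriptors)
    return features
```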
Step S202, matching different images in the first image data based on the description information of the image feature points to obtain an image relation library representing the matching relation between different images.
In some embodiments, whether image feature points in different images are the same feature point can be determined by analyzing whether their description information is consistent, and the similarity between different images can be determined based on the number of feature points with consistent description information. For example, for any two images (image A and image B) in the first image data, the similarity between image A and image B can be determined by counting the image feature points whose description information matches; the more such feature points exist between image A and image B, the higher their similarity and the closer their relationship. An image relation library is created based on the similarity between different images: treating the similarity between image feature points as the correlation between them, if two images share many feature points with consistent description information, the connection between the two images can be considered stronger, and an image relation library representing the similarity between different images is constructed based on these correlations and the images in the first image data. For example, first, SIFT features of the input first image data are extracted; then, associations between the feature points of different images are established; finally, a Scene Graph serving as the image relation library is constructed using a bag-of-words model.
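A minimal sketch of the matching step, under the same caveats (a brute-force matcher with Lowe's ratio test stands in for whatever matcher the application actually uses; the bag-of-words retrieval stage is omitted):

```python
import itertools
import cv2

def build_scene_graph(features, ratio=0.8, min_matches=15):
    """Build a scene graph: nodes are images, edges carry their matches.

    `features` maps image id -> (keypoints, descriptors). The more feature
    points with consistent description information two images share, the
    stronger their connection in the graph.
    """
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    graph = {}
    for (ia, (_, da)), (ib, (_, db)) in itertools.combinations(features.items(), 2):
        pairs = matcher.knnMatch(da, db, k=2)
        good = [m for m, n in (p for p in pairs if len(p) == 2)
                if m.distance < ratio * n.distance]  # Lowe ratio test
        if len(good) >= min_matches:
            graph[(ia, ib)] = good
    return graph
```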
Step S203, determining the first feature map based on the image relation library and the navigation data.
In some embodiments, the pose of the acquisition device is determined by first taking the navigation data as the initial image pose of the first image data, in combination with the extrinsic parameters of the navigation device acquiring the navigation data, the extrinsic parameters of the acquisition device acquiring the first image data; then, the two-dimensional coordinates (i.e. feature observation) of the image feature points in the image relation library, the pose of the acquisition device and the internal parameters of the acquisition device are used as the input of the triangulation processing, so that the spatial position of each feature point in a world coordinate system can be determined, and a preliminarily constructed feature map representing the spatial position of the image feature points in the first image data, namely the first feature map, is further realized.
In the embodiment of the application, the similarity between different images can be obtained by matching the description information of the image feature points in different images, so that an image relational library is constructed by adopting a bag-of-words model according to the similarity between different images; furthermore, the coordinates of the feature points in the image relational database can be combined with the navigation data, so that the first feature map representing the spatial position of the image feature points in the image data can be simply and quickly constructed.
In some embodiments, the navigation data is combined with preset parameters of the acquisition device, and the spatial positions of the image feature points are determined by triangulation; that is, step S203 may be implemented by the following steps:
step S141, determining a first pose of a first acquisition device of the first image data based on the navigation data and a preset first acquisition parameter.
In some embodiments, the preset first acquisition parameters include: extrinsic parameters of the device acquiring the first image data and extrinsic parameters of the device acquiring the navigation data. For example, if the device acquiring the first image data is a vehicle-mounted camera and the device acquiring the navigation data is a global satellite inertial navigation system, the extrinsic parameters of the vehicle-mounted camera and of the global satellite inertial navigation system are determined. Taking the navigation data as the initial image poses of the first image data, and combining the extrinsic parameters of the vehicle-mounted camera with those of the global satellite inertial navigation system, the pose of the vehicle-mounted camera, namely the first pose C_i, can be solved.
Step S142, determining the first feature map based on the image feature points in the relational database, the first pose, and the second acquisition parameters.
In some embodiments, the second acquisition parameters include intrinsic parameters for acquiring the first image data. The second acquisition parameter is an intrinsic parameter of the apparatus acquiring the first image data, for example, the apparatus is a vehicle-mounted video camera, and the intrinsic parameter may be a camera intrinsic parameter matrix. And triangularizing the image feature points on the basis of the camera internal reference matrix, the two-dimensional coordinates of the image feature points and the pose of the camera serving as a first pose, and determining the three-dimensional coordinates of the image feature points in a world coordinate system, namely the spatial positions of the image feature points. Therefore, the feature map of the image data can be quickly constructed by combining the feature matching relationship with the initial image pose provided by the navigation data based on the triangulation principle.
In some possible implementations, in order to improve the accuracy of the finally obtained first feature map, the initial feature map is optimally adjusted according to the predicted coordinates of the image feature points and the coordinates of the image feature points in the image, so as to obtain the first feature map, that is, the step S142 may be implemented by:
and step one, triangularizing the first true value position of the image feature point in the image based on the first pose and the second acquisition parameter to obtain the three-dimensional coordinate of the image feature point in a world coordinate system.
In some embodiments, the first true value position of the image feature point in the image, the first pose of the acquisition device of the first image data, and the internal reference matrix of the acquisition device are used as inputs for triangularization of the image feature point to obtain three-dimensional coordinates of the image feature point in a world coordinate system, so that triangularization of each image feature point is realized.
And secondly, constructing the first feature map based on the three-dimensional coordinates.
In some embodiments, after obtaining the three-dimensional coordinates of each image feature point in the image, constructing a feature map of the image feature point, and representing the spatial position of the image feature point by using the feature map; the feature map can be directly used as a first feature map for subsequent map updating; the first feature map can also be obtained by optimizing the feature point coordinates and the image pose of the feature map. In this way, the triangulated feature points are obtained by triangulating the true value positions of the image feature points, and the first feature map can be quickly constructed based on the three-dimensional coordinates of the triangulated feature points.
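The triangulation of the first true-value positions can be sketched as follows (a simplified two-view version; the application itself does not prescribe an implementation, and the projection-matrix construction assumes world-to-camera poses):

```python
import cv2
import numpy as np

def triangulate_pair(K, R1, t1, R2, t2, pts1, pts2):
    """Triangulate matched feature observations from two views.

    K          : 3x3 intrinsic matrix (the second acquisition parameter)
    (R1, t1),
    (R2, t2)   : first poses of the two views, derived from navigation data
    pts1, pts2 : Nx2 true-value pixel positions of the matched features
    Returns the Nx3 coordinates of the feature points in the world frame.
    """
    P1 = K @ np.hstack([R1, t1.reshape(3, 1)])
    P2 = K @ np.hstack([R2, t2.reshape(3, 1)])
    X_h = cv2.triangulatePoints(P1, P2,
                                pts1.T.astype(np.float64),
                                pts2.T.astype(np.float64))
    return (X_h[:3] / X_h[3]).T  # de-homogenize to 3D points
```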
In some possible implementations, after the image feature points are triangulated, an initial feature map is preliminarily constructed, and the first feature map is obtained by optimizing and adjusting the initial feature map so that it is more accurate; that is, the second step above can be implemented by the following process:
firstly, an initial feature map representing the spatial position of the image feature point is constructed based on the three-dimensional coordinates.
Secondly, a first predicted position of the spatial position projected to the first image data is determined based on the transformation parameters of the navigation data to the camera coordinate system and the second acquisition parameters.
In some embodiments, the conversion parameters from the navigation data to the camera coordinate system include: the pose of the navigation device providing the navigation data (including its orientation R_i and position t_i), and the rotation matrix and translation vector from the navigation device to the camera coordinate system. For example, if the navigation device is a global satellite inertial navigation system, the conversion parameters include the orientation R_i and position t_i of the global satellite inertial navigation system and the rotation matrix and translation vector (T_R and T_t) from the satellite inertial navigation system to the camera coordinate system. The second acquisition parameter is the intrinsic matrix K of the device acquiring the first image data, and the three-dimensional coordinates of the j-th image feature point in the initial feature map are X_j. On this basis, the first predicted position at which the image feature point is projected from its spatial position into the image is determined as K(T_R R_i X_j + T_R t_i + T_t).
Again, a first difference between the first true position and the first predicted position is determined.
In some embodiments, the first true-value position x_ij of an image feature point in the relational database is its position in the image to which it belongs. The difference between the first predicted position and the first true-value position is determined to assess the plausibility of the spatial position of the feature point in the initial feature map; image feature points whose spatial positions are implausible are then optimized and adjusted based on this difference, thereby obtaining a first feature map with higher precision. For example, for each feature point in the initial feature map, the first predicted position at which the feature point is projected from its spatial position onto the image, that is, the position predicted in the first image data, is determined; based on the true-value position of the feature point in the image, the prediction error of the first predicted position is estimated.
And finally, adjusting the spatial position of the feature point in the initial feature map based on the first difference to obtain the first feature map.
In some embodiments, based on the first difference, the three-dimensional coordinates of the image feature points in the initial feature map and the image pose may be adjusted to complete the construction of the feature map based on the vehicle-mounted collected data. In this way, by determining the first predicted position of the feature point in the initial feature map projected from the spatial position onto the image and the first true position of the feature point in the image, the spatial position and the image pose of the feature point in the initial feature map can be optimized, so that the accuracy of the optimized first feature map is higher.
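The first difference above is a standard reprojection residual; a hedged sketch using the notation of this section (an optimizer such as scipy.optimize.least_squares could minimize it over the spatial positions and image poses):

```python
import numpy as np

def first_difference(K, T_R, T_t, R_i, t_i, X_j, x_ij):
    """Reprojection residual of feature point j observed in image i.

    Projects the spatial position X_j into the image via
    K (T_R R_i X_j + T_R t_i + T_t), then subtracts the first true-value
    position x_ij; this difference drives the adjustment of the spatial
    positions in the initial feature map.
    """
    p = K @ (T_R @ R_i @ X_j + T_R @ t_i + T_t)  # homogeneous projection
    first_predicted = p[:2] / p[2]               # first predicted position
    return first_predicted - x_ij
```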
In some embodiments, in order to improve the coverage of the constructed first feature map, the first feature map is updated with second image data from a different source to obtain a second feature map with higher coverage; that is, step S103 may be implemented by the steps shown in Fig. 2B. Fig. 2B is a schematic flowchart of another implementation of the map construction method provided in an embodiment of the present application; the following description is made with reference to the steps shown in Figs. 1 and 2B:
step S221, updating an image relational database based on the matching relationship between the second image data and the first image data to obtain an updated image relational database.
In some embodiments, the manner of updating the image relation library by using the second image data is substantially the same as the implementation manner of step S201 and step S202. Taking the second image data as aerial image data and the first image data as ground image data collected by the vehicle-mounted camera as an example, by performing feature extraction on the second image data, and determining the similarity between the images in the first image data and the second image data according to the extracted feature point description information and the feature point description information in the first image data, the matching relationship between the images is obtained. Based on the matching relationship, the image relationship library established according to the matching relationship between the images in the first image data is updated, so that the updated image relationship library can not only represent the similarity degree of the images in the first image data, but also represent the similarity degree between the first image data and the second image data.
Step S222, determining an image to be registered in the updated image relation library.
In some embodiments, a target feature point corresponding to a three-dimensional point in the first feature map exists in the image to be registered. In the updated image relation library, for each image, it is determined whether the feature points in the image can find the corresponding three-dimensional points in the first feature map. And if the feature points in the image can find corresponding three-dimensional points in the first feature map, determining the image as the image to be registered.
Step S223 is to determine the image pose of the image to be registered based on the target feature point.
In some embodiments, for the target feature points capable of finding corresponding three-dimensional points in the first feature map, an image pose of an image to which the target feature points belong, that is, an image pose of the image to be registered, is determined in a random sampling manner.
And S224, registering the image pose to the first feature map to obtain a registered feature map.
In some embodiments, the image pose determined from the target feature points is registered into the first feature map, so that the spatial position of the picture content of the image to be registered, to which the target feature points belong, can be presented in the registered feature map. In this way, by combining first image data and second image data from different sources, the respective advantages of vehicle-mounted camera data and unmanned aerial vehicle data can be effectively exploited, improving the coverage and completeness of the constructed map.
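In a typical SfM pipeline the "random sampling manner" of step S223 corresponds to RANSAC-based PnP; the following sketch is one plausible realization (the application does not name a specific solver):

```python
import cv2
import numpy as np

def register_image_pose(points_3d, points_2d, K):
    """Estimate the image pose of an image to be registered.

    points_3d : Nx3 three-dimensional points from the first feature map
    points_2d : Nx2 target feature points observed in the image
    K         : intrinsic matrix of the acquisition device
    Returns (R, t) mapping world coordinates into the camera frame.
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float64), points_2d.astype(np.float64), K, None)
    if not ok:
        raise RuntimeError("image pose registration failed")
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> rotation matrix
    return R, tvec
```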
In some possible implementation manners, the registration sequence of the images to be registered is determined by counting the number of the target feature points, so that the image poses of the images to be registered are sequentially registered in the first feature map according to the registration sequence, and the method can be implemented by the following steps:
in the first step, the number of target feature points included in each image to be registered is determined.
For each frame of the image to be registered, the number of target feature points it contains is counted. The more target feature points an image to be registered contains, the higher its similarity to the area to be drawn, that is, the higher the probability that the image was acquired for the area to be drawn and the higher its overlap with the three-dimensional points in the first feature map, so the image can be registered preferentially.
And secondly, determining the registration sequence of each image to be registered based on the number.
The registration order of the images to be registered is determined in descending order of this number: the image to be registered with the largest number of target feature points is placed first, and its image pose is registered preferentially.
And thirdly, registering the image pose of each image to be registered into the first feature map based on the registration sequence to obtain the registered feature map.
For example, the image pose of the image to be registered with the largest number of target features is registered in the first feature map, and then the image pose of the image to be registered with the next number of target features is registered in the first feature map. In this way, the accuracy of the registered image pose can be improved by determining the registration order of the images to be registered according to the number of the target features.
Step S225, based on the other feature points in the image to be registered and the second image data, adjusts the registered feature map to obtain the second feature map.
In some embodiments, the other feature points are feature points other than the target feature point in the image to be registered.
In some possible implementations, in order to further optimize the registered feature map, step S225 includes the following steps of sampling the other feature points, triangulating the sampled feature points, and adjusting the registered feature map according to the triangulation result (a hedged code sketch after these steps illustrates both the sampling and the resulting residuals):
and step one, sampling the other characteristic points to obtain sampling characteristic points.
Other feature points in the image to be registered are uniformly sampled, and a small number of feature points are reserved in each sampling area, for example, one sampling feature point is reserved in each sampling area, so that the phenomenon that the image features are too concentrated can be reduced, and the complexity of global optimization is reduced.
And triangularizing the sampling feature points, and determining three-dimensional coordinates of the sampling feature points in a world coordinate system.
Triangularization is carried out on the sampling feature point through the pose of the acquisition device corresponding to the sampling feature point, the internal reference matrix of the acquisition device and the coordinate of the sampling feature point in the image, so that the spatial pose of the sampling feature point, namely the three-dimensional coordinate of the sampling feature point in a world coordinate system, is determined.
And thirdly, adjusting the registered feature map based on the three-dimensional coordinates of the sampling feature points and the second image data to obtain the second feature map.
By combining the spatial position of the sampling feature point with the pose of the acquisition device of the second image data, the difference between the actual position of the sampling feature point and the estimated spatial position can be further predicted, and then the position of the three-dimensional point in the registered feature map can be optimized based on the difference to obtain the second feature map. In some possible implementations, this may be achieved by:
the method comprises the first step of determining a second prediction position of projecting the three-dimensional coordinates of the sampling characteristic points to first image data and a third prediction position of projecting the three-dimensional coordinates of the sampling characteristic points to second image data based on the conversion parameters of the navigation data to a camera coordinate system.
The second predicted position is determined in a similar manner to the first predicted position: the rotation matrix and translation vector (T_R and T_t) from the navigation data to the camera coordinate system, the orientation R_i and position t_i of the navigation device, the intrinsic matrix K_C of the device acquiring the first image data (e.g., a vehicle-mounted camera), and the three-dimensional coordinates X_j of the sampled feature point are determined; on this basis, the second predicted position at which the sampled feature point is projected from its spatial position into the first image is determined as K_C(T_R R_i X_j + T_R t_i + T_t).
Based on the intrinsic matrix K_A of the device acquiring the second image data (e.g., an aerial unmanned aerial vehicle), the pose R_k and t_k of that device, and the three-dimensional coordinates X_j of the sampled feature point, the third predicted position at which the sampled feature point is projected from its spatial position into the second image is determined as K_A(R_k X_j + t_k).
And secondly, determining a second difference value between the second prediction position and a second true value position of the sampling feature point in the image, and determining a third difference value between the third prediction position and the second true value position.
For the sampling feature points obtained in each sampling area, determining a second prediction position of the sampling feature points projected from the space position to the first image data, namely the position predicted in the first image data; and estimating the prediction error of the second prediction position based on the second true value position of the sampling feature point in the image. Similarly, for the sampled feature point, a prediction error at the third prediction position, that is, a third difference value is estimated.
And thirdly, adjusting the space position of the feature point in the registered feature map based on a second difference value and the third difference value to obtain the second feature map.
Based on the second difference, the three-dimensional coordinates of the registered feature map corresponding to the image feature points in the first image data, and the image pose of the first image data, can be adjusted. Based on the third difference, the three-dimensional coordinates of the registered feature map corresponding to the image feature points in the second image data, and the image pose of the second image data, may be adjusted. In this way, a sampling feature point is obtained by uniformly sampling in each sampling area, and the spatial position of the feature point in the first feature map is optimized by the difference between the predicted position of the sampling feature point and the actual position in the image, so that the integrity of the second feature map is higher.
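A hedged sketch of the two mechanisms just described: grid-based uniform sampling of the other feature points, and the joint residual that combines the second and third differences (K_C, K_A, T_R, T_t, R_i, t_i, R_k, t_k as defined above; the cell size and all names are illustrative):

```python
import numpy as np

def sample_uniformly(points_2d, cell_size=64):
    """Keep at most one feature point per image grid cell, reducing
    over-concentrated features and the cost of global optimization."""
    kept, seen = [], set()
    for idx, (x, y) in enumerate(points_2d):
        cell = (int(x) // cell_size, int(y) // cell_size)
        if cell not in seen:
            seen.add(cell)
            kept.append(idx)
    return np.asarray(kept)  # indices of the sampled feature points

def joint_residual(K_C, T_R, T_t, R_i, t_i, K_A, R_k, t_k, X_j,
                   x_vehicle, x_aerial):
    """Second and third differences for one sampled feature point X_j.

    Second difference: K_C (T_R R_i X_j + T_R t_i + T_t) minus the second
    true-value position in the first (vehicle-mounted) image; third
    difference: K_A (R_k X_j + t_k) minus the true-value position in the
    second (aerial) image.
    """
    p2 = K_C @ (T_R @ R_i @ X_j + T_R @ t_i + T_t)
    p3 = K_A @ (R_k @ X_j + t_k)
    second = p2[:2] / p2[2] - x_vehicle
    third = p3[:2] / p3[2] - x_aerial
    return np.concatenate([second, third])
```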
In some embodiments, on the basis of the second feature map, a high-precision three-dimensional map is finally created by performing depth recovery, point cloud generation, mesh construction, and the like on the image data; that is, the above step S104 may be implemented by the following steps:
step S151, performing depth processing on the second image data, the second feature map, and the first image data, and generating point cloud data representing the first image data and the second image data.
In some embodiments, the second image data includes the image poses of the second images, and the first image data includes the image poses of the first images; the second feature map includes a first type of three-dimensional feature points corresponding to the image feature points in the first image data, and a second type of three-dimensional feature points corresponding to the image feature points in the second image data. Depth estimation and depth map fusion are performed on the first image poses and the first type of three-dimensional feature points as one group to obtain point cloud data representing the first image data; depth estimation and depth map fusion are performed on the second image poses and the second type of three-dimensional feature points as another group to obtain point cloud data representing the second image data.
In some possible implementations, the point cloud data may be generated by:
firstly, depth estimation is carried out on the second image data, the second feature map and the first image data to obtain a depth map.
Depth estimation is performed by combining the second images, the second image poses, and the second type of three-dimensional feature points in the second image data to obtain depth maps of the second images; depth estimation is performed by combining the first images, the first image poses, and the first type of three-dimensional feature points in the first image data to obtain depth maps of the first images. The depth maps of the first images and of the second images are the depth maps obtained in this step.
And secondly, fusing the depth map to obtain the point cloud data.
Respectively carrying out depth fusion on the depth map of the first image and the depth map of the second image to generate point cloud data corresponding to the first image and point cloud data corresponding to the second image; and performing point cloud integration on the two types of point cloud data to obtain final point cloud data. Therefore, the obtained point cloud data is richer by carrying out depth estimation and depth fusion on the two types of images.
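A minimal sketch of the fusion step, assuming each depth map comes with the intrinsics and world-to-camera pose of its image; every valid pixel is back-projected into world coordinates and the per-image clouds are concatenated (a production pipeline would also enforce cross-view consistency):

```python
import numpy as np

def depth_map_to_points(depth, K, R, t):
    """Back-project one depth map into a world-coordinate point cloud.

    depth : HxW array of per-pixel depths (0 marks invalid pixels)
    K     : intrinsics of the image; (R, t) its world-to-camera pose
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    pix = np.stack([u[valid], v[valid], np.ones(valid.sum())])  # 3xN pixels
    cam = (np.linalg.inv(K) @ pix) * depth[valid]   # camera-frame points
    return (R.T @ (cam - t.reshape(3, 1))).T        # world frame, Nx3

def fuse_point_clouds(depth_maps):
    """Integrate the per-image clouds (vehicle-mounted and aerial) into one."""
    return np.vstack([depth_map_to_points(d, K, R, t)
                      for d, K, R, t in depth_maps])
```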
Step S152, based on the point cloud data, an initial grid model representing the connection relation between the point cloud data is constructed.
In some embodiments, the connection relationship between the point cloud data indicates whether points are connected in the mesh model. Based on the point cloud data integrated from the two types of image data, whether each point can serve as a vertex is judged according to its visibility and reprojection error, so as to construct tetrahedra; a plurality of tetrahedra constructed from the plurality of vertices then serve as the initial mesh model.
In some possible implementations, the initial mesh model may be built by the following steps:
the method comprises the following steps of firstly, determining the visibility of each point in the point cloud data in the second feature map and the reprojection error of each point.
The visibility of a point of the point cloud data in the second feature map indicates whether the point has a corresponding three-dimensional feature point in the second feature map: if it does, the point is visible; if it does not, the point is not visible. Regarding the reprojection error: observations A and B form a feature matching pair, that is, they are projections of the same spatial point C, with A belonging to a different image than B; B' is the projection of A onto the coordinate system to which B belongs after A is converted into that coordinate system. The projection B' of A lies at a certain distance from the observation B, and this distance is the reprojection error. For each point in the point cloud data, the reprojection error is computed from its projections A and B in the two types of images, e.g., A in the first image data and B in the second image data: it is the distance between the observation B and the projection of A after A is converted into the coordinate system to which B belongs.
Secondly, determining, in the point cloud data, target points whose visibility and reprojection error satisfy preset conditions.
Among the point cloud data, the points whose visibility indicates that a corresponding three-dimensional feature point can be found in the second feature map, and whose reprojection error is smaller than a certain threshold, are taken as the target points.
Thirdly, taking the target points as vertices, and connecting the vertices to obtain the initial mesh model.
The target points, as vertices, are connected with other vertices in space to form tetrahedra, and the tetrahedra are taken as the initial mesh model. In this way, tetrahedra are constructed from points selected by their visibility and reprojection error, so the obtained initial mesh model fully reflects the connection relationships between target points.
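For illustration, a 3-D Delaunay triangulation is one plausible way to realize the tetrahedron construction described above (the patent does not name a specific algorithm); it can be obtained with scipy:

```python
import numpy as np
from scipy.spatial import Delaunay

# Stand-in for the filtered target points (in practice, the points that passed
# the visibility and reprojection-error checks above).
verts = np.random.rand(200, 3)
tetra = Delaunay(verts)            # 3-D Delaunay triangulation of the vertices
# tetra.simplices has shape (M, 4): each row gives the vertex indices of one
# tetrahedron; the set of tetrahedra serves as the initial mesh model.
print(tetra.simplices.shape)
```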
Step S153, determining the three-dimensional map based on the initial mesh model.
In some embodiments, since the initial mesh model includes a plurality of tetrahedra, an optimal set of faces is selected from the faces of the tetrahedra according to the confidence that each face is the surface of a real object, and the selected faces are connected to form the three-dimensional map. In this way, a high-precision dense point cloud and a high-precision mesh model are finally produced through depth recovery, point cloud generation and mesh construction; the high precision of the vehicle-mounted data is retained and combined with the high coverage of aerial photography viewing angles, so the obtained three-dimensional map has higher precision.
In some possible implementations, the construction of the three-dimensional map may be achieved by:
firstly, in the initial mesh model, determining the acquisition sources of the point cloud data corresponding to the vertices of each face.
Among the faces of the tetrahedra included in the initial mesh model, the acquisition source of each face's three vertices is determined. Taking a triangular patch as an example, the acquisition states of its three vertices are analyzed: whether the three vertices come from different acquisition devices, or all come from the same vehicle-mounted camera or aerial device (such as an aerial photography unmanned aerial vehicle).
Secondly, based on the acquisition sources, determining a weight characterizing the degree to which each face is penetrated by lines of sight.
If the three vertices come from different acquisition devices, the probability that they do not belong to the same object is high, i.e., the probability that the triangular patch is the surface of a real object is low, and the patch is given a greater weight for being penetrated by the line of sight; here, the line of sight is the ray emitted from the acquisition device to the patch.
Thirdly, determining the three-dimensional map based on the faces whose weights are smaller than a preset threshold.
For each patch, a weight representing the degree to which the face is penetrated by lines of sight is determined, and the group of faces with the minimum sum of weights is selected; connecting this group of faces yields the three-dimensional map. In this way, by analyzing the weight with which each tetrahedron face is penetrated by lines of sight and selecting the faces with smaller weights to construct the map, the constructed three-dimensional map is smoother.
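As a hedged illustration of this source-based weighting (the numeric values and source labels are hypothetical; the weighting rule assumes the same-source weight b is smaller than the mixed-source weight r, consistent with formula (3) later in this document):

```python
def face_weight(vertex_sources, b=1.0, r=4.0):
    """Line-of-sight penetration weight of one triangular patch.

    vertex_sources: acquisition source label of each of the three vertices,
    e.g. "cam0", "cam1", "uav" (hypothetical labels); b < r as assumed above."""
    return b if len(set(vertex_sources)) == 1 else r

# Keep only the faces whose weight stays under a preset threshold.
faces = [("cam0", "cam0", "cam0"), ("cam0", "cam1", "uav")]
surface = [f for f in faces if face_weight(f) < 2.0]   # -> only the same-source face
```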
In the following, an exemplary application of the embodiment of the present application in an actual application scenario is described, taking as an example the construction of a high-precision map based on a global satellite inertial navigation system, a camera and an aerial photography unmanned aerial vehicle.
With the continuous development of computer and communication technologies, maps greatly help people travel; the information they provide includes, but is not limited to, road information, building information and traffic information. High-precision maps pursue centimeter-level reconstruction accuracy and rich hierarchical information, and are mostly used in the field of unmanned driving. In the related art, maps are mostly generated by conventional acquisition followed by manual post-processing; for example, both high-precision professional-equipment acquisition and crowdsourced acquisition require manual alignment and other operations after data collection. Moreover, high-precision map data is mostly collected by acquisition vehicles within the range of paved roads, so the reconstruction accuracy of non-road areas is limited; this makes it difficult to meet the requirement of constructing a full-area high-precision map, and the completeness and hierarchical structure of the overall reconstruction remain incomplete.
Based on this, the embodiment of the present application provides a high-precision map construction method with full-area coverage using air-ground non-homologous data, which combines the advantages of comprehensive coverage and convenient acquisition of the aerial photography unmanned aerial vehicle with ground acquisition. In the embodiment of the present application, the method and the acquisition system for constructing the high-precision map can be completed automatically using the vehicle-mounted global satellite inertial navigation system, the camera and the aerial photography unmanned aerial vehicle.
Fig. 3 is a schematic flow chart of an implementation process of the map building method provided in the embodiment of the present application, and as can be seen from fig. 3, the method can be implemented through the following steps:
step S301, acquiring pose data of the vehicle-mounted global satellite inertial navigation system and acquiring image data of the vehicle-mounted camera.
And calibrating and synchronizing the pose data of the vehicle-mounted global satellite inertial navigation system and the image data of the vehicle-mounted camera. The pose data of the vehicle-mounted global satellite inertial navigation system can be used as the navigation data in the above embodiment.
Step S302, constructing a feature map of the vehicle-mounted collected data based on the pose data and the camera images.
In some embodiments, the feature map of the vehicle-mounted collected data may be the first feature map in the above embodiments. The feature map of the ground data is constructed on the triangulation principle by combining the feature matching relationships between the images acquired by the vehicle-mounted camera with the poses provided by the global satellite inertial navigation system. This mainly comprises the following three processes: feature extraction and matching; feature point triangulation; and adjustment optimization. As shown in fig. 4, fig. 4 is a schematic view of an implementation flow of initially constructing a feature map according to an embodiment of the present application, and the following description proceeds in conjunction with the steps shown in fig. 4:
in step S401, feature extraction and matching are performed on the vehicle-mounted camera image data 41.
In some embodiments, the number of matched Scale Invariant Feature Transform (SIFT) feature points is used to measure how closely two images are associated. Firstly, SIFT features of the input images are extracted; then, feature point associations between the images are established; finally, a Scene Graph is constructed using a bag-of-words model.
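A hedged OpenCV sketch of this extraction-and-matching step (file names, the ratio-test value 0.75 and the edge threshold 30 are illustrative assumptions; the bag-of-words construction is omitted):

```python
import cv2

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # hypothetical files
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe's ratio test

# Scene-graph edge: connect the two images when enough matches survive.
if len(good) > 30:
    edge = (0, 1, len(good))      # (image i, image j, association strength)
```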
Step S402, triangulating the initial feature map based on the pose data 42 of the global satellite inertial navigation system and the external parameters of the camera.
The pose (R_i, t_i) of camera C_i is solved by combining the pose data acquired by the global satellite inertial navigation system with the external parameters between the vehicle-mounted global satellite inertial navigation system and the camera; on this basis, the feature points are triangulated, and an initial feature map of the vehicle-mounted data is initially constructed.
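The following sketch shows one way this triangulation could look with OpenCV, assuming the nav-to-camera conversion uses the extrinsics (T_R, T_t) as in formula (1) below; all variable names are hypothetical:

```python
import cv2
import numpy as np

def nav_to_camera(R_nav, t_nav, T_R, T_t):
    """Convert a nav-system pose into a camera pose using the nav->camera extrinsics."""
    return T_R @ R_nav, T_R @ t_nav + T_t

def triangulate_pair(K, R1, t1, R2, t2, pts1, pts2):
    """Triangulate matched pixels observed from two posed cameras.

    pts1, pts2: (N, 2) arrays of matched feature locations.
    Returns (N, 3) initial map points for the feature map."""
    P1 = K @ np.hstack([R1, t1.reshape(3, 1)])   # 3x4 projection matrices
    P2 = K @ np.hstack([R2, t2.reshape(3, 1)])
    X_h = cv2.triangulatePoints(P1, P2, pts1.T.astype(float), pts2.T.astype(float))
    return (X_h[:3] / X_h[3]).T                   # dehomogenize
```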
Step S403, optimizing and adjusting the triangulated initial feature map.
The process of adjusting the feature point coordinates and the image pose of the initial feature map is as shown in formula (1):
$$\min_{\{R_i,\,t_i\},\,\{X_j\}} \; \sum_{i=1}^{n} \sum_{j=1}^{m} \left\| \pi\!\left(K,\; T_R R_i,\; T_R t_i + T_t,\; X_j\right) - x_{ij} \right\|^{2} \tag{1}$$

wherein n is the number of images, m is the number of map points, K is the camera intrinsic matrix, $T_R$ and $T_t$ are respectively the rotation matrix and translation vector from the satellite inertial navigation system to the camera coordinate system, $X_j$ is the three-dimensional coordinate of a map point, and $x_{ij}$ is the feature observation of map point j in image i; here

$$\pi(K, R, t, X) = \operatorname{dehom}\!\left(K\,(R X + t)\right), \qquad \operatorname{dehom}\!\left([u,\,v,\,w]^{\top}\right) = [u/w,\; v/w]^{\top}$$

denotes the perspective projection of a three-dimensional point into pixel coordinates.
In this way, the feature point coordinates and image poses of the feature map are optimized and updated, completing the link of constructing a feature map based on vehicle-mounted collected data.
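For illustration, the adjustment of formula (1) can be posed as a nonlinear least-squares problem; the sketch below builds the stacked residual vector for scipy under an assumed axis-angle pose parameterization (not specified in the patent):

```python
import cv2
import numpy as np
from scipy.optimize import least_squares

def ba_residuals(params, n_imgs, n_pts, K, T_R, T_t, observations):
    """Stacked reprojection residuals for formula (1).

    params packs, per image, an axis-angle rotation and a translation (6 values),
    followed by the flattened 3-D map points; observations is a list of
    (image index i, point index j, observed 2-D location) triples."""
    poses = params[:6 * n_imgs].reshape(n_imgs, 6)
    X = params[6 * n_imgs:].reshape(n_pts, 3)
    res = []
    for i, j, uv in observations:
        R_i, _ = cv2.Rodrigues(poses[i, :3].reshape(3, 1))
        x_cam = T_R @ (R_i @ X[j] + poses[i, 3:]) + T_t   # nav pose -> camera frame
        u = K @ x_cam
        res.append(u[:2] / u[2] - uv)
    return np.concatenate(res)

# result = least_squares(ba_residuals, x0,
#                        args=(n_imgs, n_pts, K, T_R, T_t, observations))
```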
Step S303, updating the feature map based on the unmanned aerial vehicle aerial photography data 31.
In some embodiments, feature points of the images obtained by unmanned aerial vehicle aerial photography are extracted and matched with the image data of the vehicle-mounted camera, and the scene graph and the feature map are updated. The process mainly comprises two parts: aerial image feature extraction and matching, and feature map updating.
The first part: similar to step S401, features of the aerial images obtained by the unmanned aerial vehicle are extracted and matched against the features of the vehicle-mounted camera images, and the original scene graph is updated.
The second part: by combining the new scene graph with the existing feature map, the feature map is updated using the incremental structure-from-motion method, which can be implemented through the steps shown in fig. 5:
step S501, the first feature map constructed based on the vehicle-mounted collected images and the updated scene graph are obtained.
Step S502, selecting a frame image to be registered in the updated scene image.
The frames to be registered are selected mainly according to the number of visible map points. The number of visible map points reflects the degree of overlap of the observed areas: when most of the feature points in a newly registered image can find corresponding three-dimensional points in the current feature map, the observation overlap is high and the image can be registered preferentially.
In step S503, other feature points in the image to be registered are triangulated.
For the image to be registered, the image pose is solved by PnP + RANSAC from the feature points with map observations, and the image is registered into the feature map. Then, the feature points without map observations are uniformly sampled, with only one feature point retained in each sampling area, and the spatial positions of these feature points are solved by triangulation. On the basis of ensuring complete coverage of the feature map, this prevents the image features from being overly concentrated and reduces the complexity of the global optimization.
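A minimal OpenCV sketch of the PnP + RANSAC registration (the inputs and the reprojection-error threshold are illustrative assumptions):

```python
import cv2
import numpy as np

# obj_pts: (N, 3) map points observed by the image to be registered;
# img_pts: (N, 2) the corresponding feature locations (hypothetical inputs).
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    obj_pts.astype(np.float32), img_pts.astype(np.float32),
    K, None, reprojectionError=4.0)
if ok:
    R, _ = cv2.Rodrigues(rvec)   # registered image pose (world -> camera)
```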
Step S504, performing local/global optimization adjustment on the first feature map based on the triangulated feature points to obtain a second feature map.
The process of optimizing and adjusting the three-dimensional feature points and the image poses in the first feature map is as shown in formula (2):
$$\min \; \sum_{i=1}^{n} \sum_{j=1}^{m} v_{ij} \left\| \pi\!\left(K,\; T_R R_i,\; T_R t_i + T_t,\; X_j\right) - x_{ij} \right\|^{2} \;+\; \sum_{k=1}^{O} \sum_{j=1}^{m} v_{kj} \left\| \pi\!\left(K_A,\; R_k,\; t_k,\; X_j\right) - x_{kj} \right\|^{2} \tag{2}$$

wherein O is the number of aerial images, $K_A$ is the intrinsic matrix of the aerial camera, $R_k$ and $t_k$ are the rotation matrix and translation vector of aerial image k, and $v_{kj}$ and $v_{ij}$ each represent the visibility of feature point j in the corresponding image; the meanings of the remaining parameters are in accordance with formula (1).
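As a hedged companion to the formula (1) sketch above, the residual builder below stacks the ground term and the aerial term of formula (2); here `proj` stands for the projection $\pi$, and all inputs are hypothetical:

```python
import numpy as np

def joint_residuals(proj, ground_obs, aerial_obs, K, K_A, T_R, T_t,
                    ground_poses, aerial_poses, X):
    """Residuals for formula (2): ground images use nav-derived poses with
    intrinsics K; aerial images carry free poses with their own intrinsics K_A.
    proj(K, R, t, X) returns the 2-D pixel projection, as defined with formula (1)."""
    res = []
    for i, j, uv in ground_obs:                 # visible (image, point) pairs only
        R_i, t_i = ground_poses[i]
        res.append(proj(K, T_R @ R_i, T_R @ t_i + T_t, X[j]) - uv)
    for k, j, uv in aerial_obs:                 # aerial term of formula (2)
        R_k, t_k = aerial_poses[k]
        res.append(proj(K_A, R_k, t_k, X[j]) - uv)
    return np.concatenate(res)
```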
Step S304, generating a mesh model based on the multi-sensor-fused high-precision dense point cloud.
On the basis of the existing feature map, a high-precision dense point cloud and a mesh model are finally produced through depth recovery, point cloud generation and mesh construction. On the basis of multi-view geometry, the vehicle-mounted collected data and the unmanned aerial vehicle aerial data are processed jointly, retaining both the high precision of the vehicle-mounted data and the high coverage of aerial viewing angles. The implementation process is shown in fig. 6: the two processes of depth estimation and depth map fusion are completed independently for the vehicle-mounted camera image data 61 (together with the second feature map and image poses 62) and for the unmanned aerial vehicle aerial image data 63; the point clouds are then combined to generate a dense point cloud, and the mesh model is constructed jointly.
Step S601, performing depth estimation on the vehicle-mounted camera image data 61, combining its image poses and the corresponding three-dimensional feature points in the second feature map, to obtain depth maps.
Step S602, performing depth map fusion on the depth maps obtained in step S601 to obtain the point cloud of the vehicle-mounted camera image data 61.
Step S603, performing depth estimation on the unmanned aerial vehicle aerial image data 63, combining its image poses and the corresponding three-dimensional feature points in the second feature map, to obtain depth maps.
Step S604, performing depth map fusion on the depth maps obtained in step S603 to obtain the point cloud of the unmanned aerial vehicle aerial image data 63.
Using a depth estimation method, the depth maps of the vehicle-mounted images and of the images acquired by the unmanned aerial vehicle are estimated independently from the images and their corresponding poses. The obtained depth maps are fused into point clouds, generating the point cloud of the vehicle-mounted collected data and the point cloud of the unmanned aerial vehicle aerial data; the point clouds obtained in step S602 and step S604 are then integrated to obtain the point cloud data.
Step S605, integrating the point cloud acquired by the vehicle-mounted camera with the point cloud acquired by unmanned aerial vehicle aerial photography to obtain the point cloud data.
The point cloud data is a dense point cloud.
Step S606, a mesh model is generated based on the point cloud data.
Step S607, optimizing and adjusting the mesh model according to the set weights to obtain the three-dimensional map.
The point cloud integration step merges the point cloud acquired by the vehicle-mounted camera with the point cloud acquired by unmanned aerial vehicle aerial photography. On this basis, whether a point serves as a vertex is judged by the two indicators of visibility and reprojection error, and tetrahedra are constructed. When the graph-cut algorithm (Graph-Cut) is used to extract triangular patches from the tetrahedron set, a weight term is added in combination with the characteristics of the unmanned aerial vehicle aerial data and the vehicle-mounted camera data, as shown in formula (3):
$$w(f) \;=\; \begin{cases} b, & \text{if the three vertices of face } f \text{ come from the same acquisition source} \\ r, & \text{if the three vertices of face } f \text{ come from different acquisition sources} \end{cases} \tag{3}$$

wherein b < r, so that the finally generated three-dimensional map is smoother.
In the embodiment of the present application, firstly, the pose acquired by the global satellite inertial navigation system is used as an initial value, and the feature map is quickly constructed in combination with the final adjustment optimization; this simplifies the iterative registration and optimization steps, and the inertial navigation information is fully utilized to improve reconstruction efficiency. Secondly, the advantages of different cameras are combined: the high coverage of aerial images and the full observation of vehicle-mounted data are fully exploited to construct high-precision, high-coverage dense point clouds and mesh models. In this way, the coverage and completeness of the constructed map can be improved, and a more complete three-dimensional model is finally generated.
An embodiment of the present application provides a map building apparatus, fig. 7 is a schematic structural component diagram of the map building apparatus provided in the embodiment of the present application, and as shown in fig. 7, the map building apparatus 700 includes:
a first obtaining module 701, configured to obtain first image data of a region to be drawn and navigation data corresponding to the first image data;
a first determining module 702 for determining a first feature map based on the first image data and the navigation data;
a first updating module 703, configured to update the first feature map based on the second image data of the area to be drawn, to obtain a second feature map; the first image data and the second image data are acquired in different modes;
a first constructing module 704, configured to construct a three-dimensional map of the area to be drawn based on the second image data, the second feature map, and the first image data.
In some embodiments, the first determining module 702 includes:
the first extraction submodule is used for extracting the characteristics of the first image data to obtain image characteristic points and description information of the image characteristic points;
the first matching submodule is used for matching different images in the first image data based on the description information of the image feature points to obtain an image relation library representing the matching relation between the different images;
a first determination sub-module to determine the first feature map based on the image relational library and the navigation data.
In some embodiments, the first determining sub-module includes:
the first determining unit is used for determining a first pose of the acquisition device of the first image data based on the navigation data and preset first acquisition parameters; wherein the preset first acquisition parameters comprise: external parameters for acquiring the first image data and external parameters for acquiring the navigation data;
the second determining unit is used for determining the first feature map based on the image feature points in the relational database, the first pose and the second acquisition parameters; wherein the second acquisition parameters include intrinsic parameters for acquiring the first image data.
In some embodiments, the second determining unit includes:
the first determining subunit is configured to triangulate, based on the first pose and the second acquisition parameter, a first true value position of the image feature point in the image to which the image feature point belongs, and obtain a three-dimensional coordinate of the image feature point in a world coordinate system;
a first constructing subunit, configured to construct the first feature map based on the three-dimensional coordinates.
In some embodiments, the first building subunit is further configured to: constructing an initial feature map representing the spatial position of the image feature points based on the three-dimensional coordinates; determining a first predicted position of the spatial position projected to the first image data based on the transformation parameter of the navigation data to the camera coordinate system and the second acquisition parameter; determining a first difference between the first true position and the first predicted position; and adjusting the spatial position of the feature point in the initial feature map based on the first difference to obtain the first feature map.
In some embodiments, the first updating module 703 includes:
the first updating submodule is used for updating an image relational database based on the matching relationship between the second image data and the first image data to obtain an updated image relational database;
the second determining submodule is used for determining an image to be registered in the updated image relational database; target feature points corresponding to the three-dimensional points in the first feature map exist in the image to be registered;
the third determining submodule is used for determining the image pose of the image to be registered based on the target feature point;
the first registration submodule is used for registering the image pose to the first feature map to obtain a registered feature map;
a first adjusting submodule, configured to adjust the registered feature map based on other feature points in the image to be registered and the second image data, to obtain the second feature map; and the other feature points are feature points except the target feature points in the image to be registered.
In some embodiments, the first registration sub-module further includes:
the third determining unit is used for determining the number of the target characteristic points included in each image to be registered;
a fourth determining unit configured to determine a registration order of each of the images to be registered based on the number;
and the first registration unit is used for registering the image pose of each image to be registered into the first feature map based on the registration sequence to obtain the registered feature map.
In some embodiments, the first adjustment submodule includes:
the first sampling unit is used for sampling the other characteristic points to obtain sampling characteristic points;
the fifth determining unit is used for triangularizing the sampling characteristic points and determining three-dimensional coordinates of the sampling characteristic points in a world coordinate system;
and the first adjusting unit is used for adjusting the registered feature map based on the three-dimensional coordinates of the sampling feature points and the second image data to obtain the second feature map.
In some embodiments, the first adjusting unit includes:
a second determining subunit, configured to determine, based on a conversion parameter of the navigation data to a camera coordinate system, a second predicted position at which the three-dimensional coordinates of the sampling feature points are projected to the first image data and a third predicted position at which the three-dimensional coordinates of the sampling feature points are projected to the second image data;
a third determining subunit, configured to determine a second difference value between the second prediction position and a second true value position of the sampling feature point in the image, and a third difference value between the third prediction position and the second true value position;
and the first adjusting subunit is configured to adjust the spatial position of the feature point in the registered feature map based on a second difference and the third difference, so as to obtain the second feature map.
In some embodiments, the first building block 704 includes:
the first processing submodule is used for performing depth processing on the second image data, the second feature map and the first image data to generate point cloud data representing the first image data and the second image data;
the first construction submodule is used for constructing an initial grid model representing the connection relation between the point cloud data based on the point cloud data;
a second determining submodule for determining the three-dimensional map based on the initial mesh model.
In some embodiments, the first processing sub-module comprises:
a first estimating unit, configured to perform depth estimation on the second image data, the second feature map, and the first image data to obtain a depth map;
and the first fusion unit is used for fusing the depth map to obtain the point cloud data.
In some embodiments, the first building submodule comprises:
a sixth determining unit, configured to determine visibility of each point in the point cloud data in the second feature map and a reprojection error of each point;
a seventh determining unit, configured to determine, in the point cloud data, target points whose visibility and reprojection error satisfy a preset condition;
and the first connecting unit is used for taking the target points as vertexes and connecting the vertexes to obtain the initial mesh model.
In some embodiments, the second determining sub-module includes:
an eighth determining unit, configured to determine, in the initial mesh model, acquisition sources of point cloud data corresponding to a plurality of vertices of each surface;
a ninth determining unit, configured to determine, based on the acquisition source, a weight characterizing each of the faces penetrated by a line of sight;
a tenth determination unit configured to determine the three-dimensional map based on a face whose weight is smaller than a preset threshold.
In some embodiments, the first image data acquisition device comprises an on-board camera, and/or the second image data acquisition device is an aerial device.
It should be noted that the above description of the embodiment of the apparatus, similar to the above description of the embodiment of the method, has similar beneficial effects as the embodiment of the method. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
It should be noted that, in the embodiment of the present application, if the map building method is implemented in the form of a software functional module and sold or used as a standalone product, it may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may essentially be implemented in the form of a software product, which is stored in a storage medium and includes several instructions to enable an electronic device (which may be a terminal, a server, etc.) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, the embodiment of the present application further provides a computer program product, where the computer program product includes computer-executable instructions, and after the computer-executable instructions are executed, the steps in the map building method provided by the embodiment of the present application can be implemented.
Accordingly, an embodiment of the present application further provides a computer storage medium, where computer-executable instructions are stored on the computer storage medium, and when executed by a processor, the computer-executable instructions implement the steps of the map building method provided in the foregoing embodiment.
Accordingly, an electronic device is provided in an embodiment of the present application, fig. 8 is a schematic view of a composition structure of the electronic device provided in the embodiment of the present application, and as shown in fig. 8, the electronic device 800 includes: a processor 801, at least one communication bus, a communication interface 802, at least one external communication interface, and a memory 803. Wherein the communication interface 802 is configured to enable connected communication between these components. The communication interface 802 may include a display screen, and the external communication interface may include a standard wired interface and a wireless interface. The processor 801 is configured to execute an image processing program in the memory to implement the steps of the map building method provided in the above embodiments.
The above descriptions of the map building apparatus, electronic device and storage medium embodiments are similar to the descriptions of the method embodiments above, and they have technical descriptions and beneficial effects similar to those of the corresponding method embodiments, which are not repeated here for reasons of space. For technical details not disclosed in the embodiments of the map building apparatus, electronic device and storage medium of the present application, reference is made to the description of the method embodiments of the present application.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not imply an execution order; the execution order of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation of the embodiments of the present application. The above serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit. Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof that contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code. The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (17)

1. A map construction method, characterized in that the method comprises:
acquiring first image data of a region to be drawn and navigation data corresponding to the first image data;
determining a first feature map based on the first image data and the navigation data;
updating the first feature map based on second image data of the area to be drawn to obtain a second feature map; the first image data and the second image data are acquired in different modes;
and constructing a three-dimensional map of the area to be drawn based on the second image data, the second feature map and the first image data.
2. The method of claim 1, wherein determining a first feature map based on the first image data and the navigation data comprises:
performing feature extraction on the first image data to obtain image feature points and description information of the image feature points;
matching different images in the first image data based on the description information of the image feature points to obtain an image relation library representing the matching relation between the different images;
determining the first feature map based on the image relational library and the navigation data.
3. The method of claim 2, wherein determining the first feature map based on the image relational library and the navigation data comprises:
determining a first pose of an acquisition device of the first image data based on the navigation data and a preset first acquisition parameter; wherein the preset first acquisition parameter comprises: external parameters for acquiring the first image data and external parameters for acquiring the navigation data;
determining the first feature map based on image feature points, the first pose and second acquisition parameters in the relational database; wherein the second acquisition parameters include intrinsic parameters for acquiring the first image data.
4. The method of claim 3, wherein determining the first feature map based on image feature points in the relational database, the first pose, and second acquisition parameters comprises:
based on the first pose and the second acquisition parameter, triangulating the first true value position of the image feature point in the image to which it belongs, to obtain three-dimensional coordinates of the image feature point in a world coordinate system;
and constructing the first feature map based on the three-dimensional coordinates.
5. The method of claim 4, wherein said constructing the first feature map based on the three-dimensional coordinates comprises:
constructing an initial feature map representing the spatial position of the image feature points based on the three-dimensional coordinates;
determining a first predicted position of the spatial position projected to the first image data based on the transformation parameter of the navigation data to the camera coordinate system and the second acquisition parameter;
determining a first difference between the first true position and the first predicted position;
and adjusting the spatial position of the feature point in the initial feature map based on the first difference to obtain the first feature map.
6. The method according to any one of claims 1 to 5, wherein the updating the first feature map based on the second image data of the area to be drawn to obtain a second feature map comprises:
updating an image relational database based on the matching relationship between the second image data and the first image data to obtain an updated image relational database;
determining an image to be registered in the updated image relational database; target feature points corresponding to the three-dimensional points in the first feature map exist in the image to be registered;
determining the image pose of the image to be registered based on the target feature points;
registering the image pose to the first feature map to obtain a registered feature map;
adjusting the registered feature map based on other feature points in the image to be registered and the second image data to obtain the second feature map; and the other feature points are feature points except the target feature points in the image to be registered.
7. The method of claim 6, wherein said registering the image pose to the first feature map to obtain a registered feature map further comprises:
determining the number of target feature points included in each image to be registered;
determining a registration sequence of each image to be registered based on the number;
and registering the image pose of each image to be registered into the first feature map based on the registration sequence to obtain the registered feature map.
8. The method according to claim 6 or 7, wherein the adjusting the registered feature map based on the other feature points in the image to be registered and the second image data to obtain the second feature map comprises:
sampling the other characteristic points to obtain sampling characteristic points;
triangulating the sampling feature points, and determining three-dimensional coordinates of the sampling feature points in a world coordinate system;
and adjusting the registered feature map based on the three-dimensional coordinates of the sampling feature points and the second image data to obtain the second feature map.
9. The method of claim 8, wherein the adjusting the registered feature map based on the three-dimensional coordinates of the sampled feature points and the second image data to obtain the second feature map comprises:
determining a second prediction position of projecting the three-dimensional coordinates of the sampling characteristic points to first image data and a third prediction position of projecting the three-dimensional coordinates of the sampling characteristic points to second image data based on the conversion parameters of the navigation data to a camera coordinate system;
determining a second difference value between the second predicted position and a second true position of the sampling feature point in the image, and a third difference value between the third predicted position and the second true position;
and adjusting the spatial position of the feature point in the registered feature map based on a second difference value and the third difference value to obtain the second feature map.
10. The method according to any one of claims 1 to 9, wherein constructing the three-dimensional map of the area to be rendered based on the second image data, the second feature map, and the first image data comprises:
performing depth processing on the second image data, the second feature map and the first image data to generate point cloud data representing the first image data and the second image data;
constructing an initial grid model representing a connection relation between the point cloud data based on the point cloud data;
determining the three-dimensional map based on the initial mesh model.
11. The method of claim 10, wherein the depth processing the second image data, the second feature map, and the first image data to generate point cloud data characterizing the first image data and the second image data comprises:
performing depth estimation on the second image data, the second feature map and the first image data to obtain a depth map;
and fusing the depth map to obtain the point cloud data.
12. The method of claim 10 or 11, wherein constructing an initial mesh model characterizing connection relationships between the point cloud data based on the point cloud data comprises:
determining the visibility of each point in the point cloud data in the second feature map and the reprojection error of each point;
determining a target point with visibility and a reprojection error meeting preset conditions in the point cloud data;
and taking the target points as vertexes, and connecting the vertexes to obtain the initial mesh model.
13. The method of any of claims 10 to 12, wherein determining the three-dimensional map based on the initial mesh model comprises:
determining the acquisition sources of point cloud data corresponding to a plurality of vertexes of each surface in the initial mesh model;
determining, based on the acquisition sources, weights characterizing penetration of each of the faces by a line of sight;
and determining the three-dimensional map based on the surface with the weight smaller than a preset threshold value.
14. The method of any of claims 1 to 13, wherein the acquisition device of the first image data comprises a vehicle-mounted camera, and/or the acquisition device of the second image data is an aerial device.
15. A map building apparatus, characterized in that the apparatus comprises:
the device comprises a first acquisition module, a second acquisition module and a display module, wherein the first acquisition module is used for acquiring first image data of a region to be drawn and navigation data corresponding to the first image data;
a first determination module to determine a first feature map based on the first image data and the navigation data;
the first updating module is used for updating the first feature map based on second image data of the area to be drawn to obtain a second feature map; the first image data and the second image data are acquired in different modes;
and the first construction module is used for constructing the three-dimensional map of the area to be drawn based on the second image data, the second characteristic map and the first image data.
16. A computer storage medium having computer-executable instructions stored thereon that, when executed, perform the method steps of any of claims 1 to 14.
17. An electronic device, comprising a memory having computer-executable instructions stored thereon and a processor capable of performing the method steps of any one of claims 1 to 14 when executing the computer-executable instructions on the memory.
CN202111210541.3A 2021-10-18 2021-10-18 Map construction method, map construction device, map construction equipment and storage medium Pending CN113920263A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111210541.3A CN113920263A (en) 2021-10-18 2021-10-18 Map construction method, map construction device, map construction equipment and storage medium
PCT/CN2022/093620 WO2023065657A1 (en) 2021-10-18 2022-05-18 Map construction method and apparatus, and device, storage medium and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111210541.3A CN113920263A (en) 2021-10-18 2021-10-18 Map construction method, map construction device, map construction equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113920263A true CN113920263A (en) 2022-01-11

Family

ID=79241306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111210541.3A Pending CN113920263A (en) 2021-10-18 2021-10-18 Map construction method, map construction device, map construction equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113920263A (en)
WO (1) WO2023065657A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114581621A (en) * 2022-03-07 2022-06-03 北京百度网讯科技有限公司 Map data processing method, map data processing device, electronic equipment and medium
CN115908482A (en) * 2022-10-14 2023-04-04 荣耀终端有限公司 Method and device for positioning modeling error data
WO2023065657A1 (en) * 2021-10-18 2023-04-27 上海商汤智能科技有限公司 Map construction method and apparatus, and device, storage medium and program

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118154801B (en) * 2024-05-13 2024-07-19 嘉兴明绘信息科技有限公司 Three-dimensional map construction method based on data analysis

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110779496B (en) * 2018-07-30 2022-03-22 斑马智行网络(香港)有限公司 Three-dimensional map construction system, method, device and storage medium
WO2020181418A1 (en) * 2019-03-08 2020-09-17 SZ DJI Technology Co., Ltd. Techniques for collaborative map construction between unmanned aerial vehicle and ground vehicle
WO2020191731A1 (en) * 2019-03-28 2020-10-01 深圳市大疆创新科技有限公司 Point cloud generation method and system, and computer storage medium
KR20210061722A (en) * 2019-11-20 2021-05-28 팅크웨어(주) Method, apparatus, computer program and computer readable recording medium for producing high definition map
CN112184890B (en) * 2020-10-14 2023-06-30 佳都科技集团股份有限公司 Accurate positioning method of camera applied to electronic map and processing terminal
CN112837419B (en) * 2021-03-04 2022-06-24 浙江商汤科技开发有限公司 Point cloud model construction method, device, equipment and storage medium
CN113340312A (en) * 2021-08-05 2021-09-03 中铁建工集团有限公司 AR indoor live-action navigation method and system
CN113920263A (en) * 2021-10-18 2022-01-11 浙江商汤科技开发有限公司 Map construction method, map construction device, map construction equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023065657A1 (en) * 2021-10-18 2023-04-27 上海商汤智能科技有限公司 Map construction method and apparatus, and device, storage medium and program
CN114581621A (en) * 2022-03-07 2022-06-03 北京百度网讯科技有限公司 Map data processing method, map data processing device, electronic equipment and medium
CN115908482A (en) * 2022-10-14 2023-04-04 荣耀终端有限公司 Method and device for positioning modeling error data
CN115908482B (en) * 2022-10-14 2023-10-20 荣耀终端有限公司 Modeling error data positioning method and device

Also Published As

Publication number Publication date
WO2023065657A1 (en) 2023-04-27

Similar Documents

Publication Publication Date Title
US9330504B2 (en) 3D building model construction tools
CN113920263A (en) Map construction method, map construction device, map construction equipment and storage medium
US8422825B1 (en) Method and system for geometry extraction, 3D visualization and analysis using arbitrary oblique imagery
CN111486855A (en) Indoor two-dimensional semantic grid map construction method with object navigation points
US11682170B2 (en) Generating three-dimensional geo-registered maps from image data
JP7273927B2 (en) Image-based positioning method and system
KR102200299B1 (en) A system implementing management solution of road facility based on 3D-VR multi-sensor system and a method thereof
EP3274964B1 (en) Automatic connection of images using visual features
JP2015084229A (en) Camera pose determination method and actual environment object recognition method
CN112750203B (en) Model reconstruction method, device, equipment and storage medium
CN113048980B (en) Pose optimization method and device, electronic equipment and storage medium
US20220270323A1 (en) Computer Vision Systems and Methods for Supplying Missing Point Data in Point Clouds Derived from Stereoscopic Image Pairs
CN115953535A (en) Three-dimensional reconstruction method and device, computing equipment and storage medium
JP7365385B2 (en) Map generation method and image-based positioning system using the same
KR102249381B1 (en) System for generating spatial information of mobile device using 3D image information and method therefor
KR102130687B1 (en) System for information fusion among multiple sensor platforms
US20220276046A1 (en) System and method for providing improved geocoded reference data to a 3d map representation
CN116758150B (en) Position information determining method and device
Helmholz et al. Approach for the semi-automatic verification of 3D building models
Sümer et al. Automatic near-photorealistic 3-D modelling and texture mapping for rectilinear buildings
Roozenbeek Dutch Open Topographic Data Sets as Georeferenced Markers in Augmented Reality
CN117893600A (en) Unmanned aerial vehicle image visual positioning method supporting scene apparent difference
Zhou et al. Accuracy improvement of urban true orthoimage generation using 3D R-tree-based urban model
Xie et al. 3D detailed building model extraction for urban large-scale orthorectification
Duraisamy et al. Low-complexity registration of visual imagery with 3-D LiDAR

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination