CN118279515A - Real-time multi-terminal map fusion method and device - Google Patents


Info

Publication number
CN118279515A
Authority
CN
China
Prior art keywords
map
point cloud
point
pose
dense sub
Prior art date
Legal status
Pending
Application number
CN202410713366.7A
Other languages
Chinese (zh)
Inventor
胡劲文
高琛淇
雷毅飞
张德腾
徐钊
韩军伟
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date: 2024-06-04
Filing date: 2024-06-04
Publication date: 2024-07-02
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202410713366.7A priority Critical patent/CN118279515A/en
Publication of CN118279515A publication Critical patent/CN118279515A/en


Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a real-time multi-terminal map fusion method and device. Two different dense sub-maps are acquired; the overlap region between them is determined with an AABB algorithm; matching point pairs are searched within the overlap region and screened based on their normal vectors; the screened matching point pairs are then used as input to a GICP registration method that fuses the two dense sub-maps into a global map. By detecting the overlap region between different dense sub-maps, screening matching point pairs within that region, and fusing the sub-maps based on those pairs, the method avoids a full-coverage search over the dense sub-maps, reduces the computational load of map fusion, and improves real-time performance.

Description

Real-time multi-terminal map fusion method and device
Technical Field
The invention belongs to the technical field of map fusion, and particularly relates to a real-time multi-terminal map fusion method and device.
Background
Unmanned aerial vehicles often have to complete flight tasks in complex situations where the Global Positioning System (GPS) fails. In such cases the UAV must rely on its onboard sensors for positioning, navigation, and environmental awareness, so simultaneous localization and mapping methods are of paramount importance.
As the intelligence of single UAVs gradually matures, cooperative perception among multiple UAVs has become an important research field. A single UAV executing a task suffers from small coverage, poor robustness, and limited observation capability. Executing tasks cooperatively with multiple UAVs not only enhances environmental perception capability, but also reduces attrition costs and improves spatio-temporal efficiency.
Existing three-dimensional reconstruction methods repeatedly optimize and adjust multiple acquired images or point clouds. Although they can produce high-precision models close to the target's appearance, they are very time-consuming: the larger the reconstruction range, the higher the computational complexity, so real-time performance cannot be guaranteed, making them unsuitable for UAV applications.
Disclosure of Invention
The invention aims to provide a real-time multi-terminal map fusion method and device, which can reduce the calculated amount of map fusion by searching and positioning the overlapping area of dense sub-maps of multiple terminals, thereby improving the real-time performance.
The invention adopts the following technical scheme: a real-time multi-terminal map fusion method comprises the following steps:
acquiring two different dense sub-maps;
Determining an overlap region between two different dense sub-maps using an AABB algorithm;
searching matching point pairs in the overlapping area, and screening the matching point pairs based on normal vectors of the matching point pairs; the matching point pairs consist of two characteristic points at corresponding positions in different dense sub-maps;
and taking the screened matching point pairs as input information, and fusing two different dense sub-maps by adopting a GICP registration method to generate a global map.
Further, searching for matching point pairs in the overlapping region includes:
The feature points in one overlapping region are inserted into the octree data structure and the pairing points of the feature points in the other overlapping region are found from the octree data structure.
Further, screening the matching point pairs based on normal vectors of the matching point pairs includes:
Calculating normal vectors of two characteristic points in the matching point pair by adopting a principal component analysis method;
calculating an included angle of normal vectors of the two feature points;
And deleting the corresponding matching point pair when the included angle of the normal vectors of the two feature points is larger than a preset included angle threshold value.
Further, before acquiring the two different dense sub-maps, the method includes:
acquiring feature information of the dense sub-map, and continuing to execute when the feature information meets a preset condition; the feature information comprises the accumulated number of key frames, the length of the pose map, the accumulated time and the number of feature points of the dense sub-map.
Further, the method for constructing the dense sub map comprises the following steps:
Acquiring a point cloud frame with a pose; the point cloud frame comprises point clouds;
selecting a corresponding point cloud frame as a key frame according to the pose relative variation of the point cloud frame;
And establishing a dense sub-map according to the key frames.
Further, building a dense sub-map from the key frames includes:
Performing filtering pretreatment on the point cloud of the key frame;
Determining whether the position of the key frame reaches a historical position based on the filtered and preprocessed point cloud;
And when the position of the key frame does not reach the historical position, carrying out pose registration according to the point cloud after the filtering pretreatment, and generating a dense sub-map.
Further, after pose registration according to the filtered and preprocessed point cloud and before generating the dense sub-map, the method further comprises:
voxel filtering is performed on the point cloud after pose registration.
Further, voxel filtering the point cloud after pose registration includes:
The centroid point of the small cube in the voxel filtering method is used for representing the characteristic point in the small cube.
Further, when the position of the key frame reaches the history position:
the point cloud after the filtering pretreatment in the key frame is designed into a KD-tree data structure;
searching for a matching point pair based on the KD-tree data structure and the point cloud in the key frame corresponding to the history position;
calculating the theoretical pose change between the key frame and the key frame corresponding to the historical position with a GICP method, taking the matching point pairs as input;
and updating the pose corresponding to the key frame according to the theoretical pose change quantity.
Another technical scheme of the invention is as follows: a real-time multi-terminal map fusion apparatus comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor implements the method described above when executing the computer program.
The beneficial effects of the invention are as follows: according to the method, the overlapping area detection is carried out on different dense sub-maps, the matching point pairs are screened out according to the overlapping area, and finally, the different dense sub-maps are fused based on the matching point pairs, so that the full coverage search on the dense sub-maps can be avoided, the map fusion calculation amount is reduced, and the instantaneity is improved.
Drawings
FIG. 1 is a schematic diagram of a real-time multi-terminal map fusion method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a dense sub-map building portion in an embodiment of the invention;
Fig. 3 is a schematic diagram of a dense sub-map fusion portion in an embodiment of the invention.
Detailed Description
The invention will be described in detail below with reference to the drawings and the detailed description.
With the progress of technology, artificial intelligence continues to develop, and various advanced robots are being developed to replace people in efficiently completing specific tasks. The Unmanned Aerial Vehicle (UAV), as an aerial platform, has small volume, high flexibility, and few spatial constraints; it can perform operations such as aerial reconnaissance and strike, rescue search, situation awareness, and mapping and modeling, and has a wide market and applications in both military and civil fields.
When a UAV completes a reconnaissance and perception task in an unknown environment, it needs the capability of constructing a three-dimensional dense map in addition to autonomous positioning. The sparse map of three-dimensional landmark points built during positioning cannot visually present the real environment, and reconnaissance perception on a multi-UAV platform raises the problems of constructing a single UAV's three-dimensional dense map and fusing the sub-maps of multiple UAVs. Existing real-time dense mapping methods can guarantee the real-time performance of map building, but composition errors and sensor errors produce redundant map points and ghosting in local areas. Moreover, existing methods do not address the construction and fusion of three-dimensional dense point cloud maps on multi-UAV platforms.
Based on the above situation, and with the aim of improving the stability and mapping precision of multi-UAV collaborative perception, the invention combines modern perception technology with a map fusion algorithm. Taking small quad-rotor UAVs as the specific research object, and targeting multi-UAV tasks such as aerial reconnaissance, rescue search, situation awareness, and mapping and modeling, it studies the theory and application of multi-UAV collaborative perception, focusing on the problems of dense map construction and map fusion, and strives for breakthroughs and innovation in the theory and application of collaborative perception research. The multi-terminal map refers to the dense sub-maps of multiple UAVs; that is, a terminal in the invention can be a sensor terminal mounted on each of several different UAVs.
The invention addresses multi-UAV dense map building and map fusion, aiming to improve the overall performance of a UAV formation through multi-UAV collaborative environment perception and high-precision map fusion. Compared with single-UAV perception, multi-UAV collaborative perception enhances environmental perception, improves positioning precision, and strengthens autonomy, thereby expanding the application range of UAV technology, reducing the risk of task execution, and providing a more efficient and safer solution for a wide range of military and civil application scenarios.
Considering the collaborative perception task and the quad-rotor UAV platform, the invention designs a visual three-dimensional dense map construction method: it screens sparse image key frames to obtain point clouds with pose, applies filtering preprocessing to the point clouds, registers them according to their pose, and completes the dense point cloud map in combination with voxel filtering; when a loop is detected, GICP based on KD-tree nearest-neighbor search updates the dense point cloud map, reducing accumulated error. A three-dimensional dense map fusion method based on the overlap region is then provided: it judges whether the growing sub-maps should be fused; when fusion is required, it finds the overlap region of the sub-maps with AABB bounding boxes, constructs an octree structure, finds matching point pairs in the overlap region based on voxels, and computes the normal vector angle of each matching pair to remove erroneous pairs, thus completing sub-map fusion.
The invention discloses a real-time multi-terminal map fusion method, which comprises the following steps: acquiring two different dense sub-maps; determining an overlap region between two different dense sub-maps using an AABB algorithm; searching matching point pairs in the overlapping area, and screening the matching point pairs based on normal vectors of the matching point pairs; the matching point pairs consist of two characteristic points at corresponding positions in different dense sub-maps; and taking the screened matching point pairs as input information, and fusing two different dense sub-maps by adopting a GICP registration method to generate a global map.
According to the method, the overlapping area detection is carried out on different dense sub-maps, the matching point pairs are screened out according to the overlapping area, and finally, the different dense sub-maps are fused based on the matching point pairs, so that the full coverage search on the dense sub-maps can be avoided, the map fusion calculation amount is reduced, and the instantaneity is improved.
More specifically, as shown in fig. 1, UAV 1, 2, …, n represent the 1st to n-th unmanned aerial vehicles. The input information of each UAV includes a depth map, an optimized pose, and a color map. After receiving this information, each UAV performs a dense mapping step that sequentially screens key frames to obtain point clouds, filters the point clouds, and registers them to form a dense map, finally yielding the dense sub-map of that UAV. A multi-UAV map fusion step is then carried out on the dense sub-maps of all UAVs, sequentially acquiring the overlap region, searching for matching point pairs, and fusing the three-dimensional dense maps, finally producing the global map of the multiple UAVs.
Specifically, in the three-dimensional dense map construction section, as shown in fig. 2, sparse key frames are first screened according to a strategy. The color map and the depth map are aligned and combined with the pose to obtain a point cloud frame with pose. The input point cloud frame then undergoes filtering preprocessing to suppress noise, sequentially applying direct (pass-through) filtering, radius filtering, and outlier filtering. Loop detection and similarity verification then judge whether the point cloud frame has returned near a historical position; if so, point cloud registration updates the three-dimensional dense map using a KD-tree-based GICP method. If not, the point cloud is registered into the three-dimensional dense map by its pose; voxel filtering is applied after registration to update the map and limit its size, and the three-dimensional dense map is output.
In one embodiment, the method for constructing the dense sub-map includes: acquiring a point cloud frame with a pose, wherein the point cloud frame comprises point clouds, namely a set of points formed by a series of characteristic points; selecting a corresponding point cloud frame as a key frame according to the pose relative variation of the point cloud frame; and establishing a dense sub-map according to the key frames.
In the sparse key frame screening and point cloud acquisition part, the pose transformation between consecutive point cloud frames is small, i.e., the observation changes little; registering every point cloud frame would therefore be inefficient and computationally heavy, and would also strain the UAV's transmission load. The method therefore adopts screened sparse image key frames as the input to three-dimensional dense map construction.
The criterion for screening image key frames is a relatively obvious and uniform change in observation. Suppose the $k$-th point cloud frame $F_k$ has pose $T_k$, obtained through a visual positioning algorithm, and the last selected key frame is the reference frame $F_{ref}$ with pose $T_{ref}$. The relative pose transformation between the two point cloud frames is

$$\Delta T = T_{ref}^{-1} T_k \tag{1}$$
The relative transformation is decomposed into a rotation part and a translation part:

$$\Delta T = \begin{bmatrix} \Delta R & \Delta \mathbf{t} \\ \mathbf{0}^{\top} & 1 \end{bmatrix}, \qquad \Delta R \mapsto (\Delta\phi,\ \Delta\theta,\ \Delta\psi), \qquad \Delta\mathbf{t} = (\Delta x,\ \Delta y,\ \Delta z)^{\top} \tag{2}$$

where the rotation components $\Delta\phi$, $\Delta\theta$, and $\Delta\psi$ denote the changes in roll, pitch, and yaw angle, respectively; the translation components $\Delta x$, $\Delta y$, and $\Delta z$ denote the translational changes along the $x$, $y$, and $z$ axes of the world coordinate system; and $\Delta T$ denotes the combination of rotation and translation.
Next, define:

$$\varepsilon_R = \mathbf{w}_R^{\top} \left( |\Delta\phi|,\ |\Delta\theta|,\ |\Delta\psi| \right)^{\top}, \qquad \varepsilon_t = \mathbf{w}_t^{\top} \left( |\Delta x|,\ |\Delta y|,\ |\Delta z| \right)^{\top} \tag{3}$$

where $\mathbf{w}_R = (w_\phi, w_\theta, w_\psi)^{\top}$ and $\mathbf{w}_t = (w_x, w_y, w_z)^{\top}$ are different weight vectors: $w_\phi$, $w_\theta$, and $w_\psi$ are the weight factors of the roll, pitch, and yaw angle changes, and $w_x$, $w_y$, and $w_z$ are the weight factors of $\Delta x$, $\Delta y$, and $\Delta z$. Then $\varepsilon_R$ and $\varepsilon_t$ are the corrected rotation transformation amount and the corrected translation transformation amount, respectively.
Given a screening threshold $\delta$ for sparse key frames, the corrected relative pose transformation $\varepsilon = \max(\varepsilon_R, \varepsilon_t)$ is compared with it. If $\varepsilon$ is greater than the screening threshold $\delta$, then $F_k$ is defined as a key frame and used as input to the visual dense map; it becomes the new reference frame, and its pose becomes the pose of the new reference frame. Otherwise, the point cloud frame is not a key frame. Expressed mathematically:

$$F_k \ \text{is a key frame} \iff \max(\varepsilon_R,\ \varepsilon_t) > \delta \tag{4}$$
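As an illustration, the screening rule of formulas (1)-(4) can be sketched in a few lines of numpy. The function name, the ZYX Euler-angle convention, and the weight/threshold arguments are assumptions of this sketch, not taken from the patent:

```python
import numpy as np

def is_keyframe(T_k, T_ref, w_R, w_t, delta):
    """Decide whether frame pose T_k (4x4) is a key frame relative to T_ref."""
    dT = np.linalg.inv(T_ref) @ T_k                    # relative transform, Eq. (1)
    dR, dt = dT[:3, :3], dT[:3, 3]                     # rotation / translation, Eq. (2)
    # Euler angles (roll, pitch, yaw) from the rotation matrix, ZYX convention
    roll  = np.arctan2(dR[2, 1], dR[2, 2])
    pitch = np.arcsin(np.clip(-dR[2, 0], -1.0, 1.0))
    yaw   = np.arctan2(dR[1, 0], dR[0, 0])
    eps_R = np.dot(w_R, np.abs([roll, pitch, yaw]))    # weighted rotation, Eq. (3)
    eps_t = np.dot(w_t, np.abs(dt))                    # weighted translation, Eq. (3)
    return max(eps_R, eps_t) > delta                   # screening test, Eq. (4)
```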
The RGB-D depth camera obtains the depth of each pixel through active measurement. The RGB three-channel color map and the aligned depth data are acquired; for each pixel, the pinhole camera model gives

$$Z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K\, P_c \tag{5}$$
where $Z$ is the depth of the pixel, $(u, v, 1)^{\top}$ are the homogeneous coordinates of the pixel, and $K$ is the intrinsic (internal reference) matrix of the camera. The coordinates of a point $P_c$ of the point cloud in the camera coordinate system can then be expressed as

$$X_c = \frac{(u - c_x)\,Z}{f_x}, \qquad Y_c = \frac{(v - c_y)\,Z}{f_y}, \qquad Z_c = Z \tag{6}$$

where $X_c$ denotes the $X$-axis coordinate of the point in the camera coordinate system, $u$ the horizontal-axis coordinate of the corresponding pixel in the pixel coordinate system, $Y_c$ the $Y$-axis coordinate of the point, $v$ the vertical-axis coordinate of the corresponding pixel, $f_x$ and $f_y$ the focal lengths along the $X$ and $Y$ axes, $c_x$ the central abscissa of the image width, and $c_y$ the central ordinate of the image height.
The RGB color information of the pixel corresponding to each point is then attached, yielding a point cloud with color texture $P = \{p_i\}$, where each point carries the information $p_i = (x, y, z, r, g, b)$: $r$ denotes the point's information in the R channel, $g$ its information in the G channel, and $b$ its information in the B channel.

For a key frame selected for the dense map, the corresponding RGB values can be computed by traversing all feature points of the key frame, attaching the corresponding color-textured point cloud.
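A minimal back-projection sketch of formulas (5)-(6), assuming an aligned depth/RGB image pair and illustrative parameter names:

```python
import numpy as np

def depth_to_colored_cloud(depth, rgb, fx, fy, cx, cy):
    """Back-project an aligned depth/RGB pair into an (N, 6) colored cloud, Eq. (5)-(6)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    valid = z > 0                              # pixels with a measured depth
    x = (u - cx) * z / fx                      # X_c = (u - c_x) Z / f_x
    y = (v - cy) * z / fy                      # Y_c = (v - c_y) Z / f_y
    pts = np.stack([x[valid], y[valid], z[valid]], axis=1)
    cols = rgb[valid].reshape(-1, 3)           # attach per-pixel RGB
    return np.hstack([pts, cols])              # each row: (x, y, z, r, g, b)
```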
In one embodiment, building a dense sub-map from key frames includes: performing filtering pretreatment on the point cloud of the key frame; determining whether the position of the key frame reaches a historical position based on the filtered and preprocessed point cloud; and when the position of the key frame does not reach the historical position, carrying out pose registration according to the point cloud after the filtering pretreatment, and generating a dense sub-map.
Regarding point cloud filtering: due to environmental factors, the precision of the measuring equipment, and the surface properties of the measured object, the data inevitably contain noise points, so the point cloud corresponding to a key frame needs filtering preprocessing. First, because the point cloud collected by the depth camera is noisy and inaccurate at the edges and at long range, the input range of the point cloud is limited by direct filtering, so that the input point cloud is regular and contains few inaccurate points.
Direct filtering filters the point cloud along a chosen dimension, such as a range of position coordinates or RGB values: it traverses every feature point in the point cloud and deletes either the feature points inside the range or those outside it.
During preprocessing, position coordinate thresholds $[x_{min}, x_{max}]$, $[y_{min}, y_{max}]$, and $[z_{min}, z_{max}]$ are determined from the detection range of the camera. Suppose the point cloud composed of $n$ feature points is $P = \{p_i\}$, where each $p_i = (x_i, y_i, z_i, r_i, g_i, b_i)$ is a six-dimensional vector. The point cloud after direct filtering is

$$P' = \{\, p_i \in P \mid x_{min} \le x_i \le x_{max},\; y_{min} \le y_i \le y_{max},\; z_{min} \le z_i \le z_{max} \,\} \tag{7}$$

Formula (7) traverses $P$: a feature point is retained if it lies within all three position coordinate threshold ranges; otherwise it is filtered out.
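Formula (7) amounts to three interval tests per point. A sketch, assuming the row-per-point array layout used above (function and argument names are illustrative):

```python
def pass_through(cloud, x_lim, y_lim, z_lim):
    """Keep points whose coordinates fall inside all three intervals, Eq. (7)."""
    xyz = cloud[:, :3]
    keep = ((xyz[:, 0] >= x_lim[0]) & (xyz[:, 0] <= x_lim[1]) &
            (xyz[:, 1] >= y_lim[0]) & (xyz[:, 1] <= y_lim[1]) &
            (xyz[:, 2] >= z_lim[0]) & (xyz[:, 2] <= z_lim[1]))
    return cloud[keep]
```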
Outliers of the point cloud can cause inaccurate or mismatching estimates of the point cloud features, resulting in failure of registration. The purpose of outlier filtering of the point cloud is to exclude outliers in the point cloud.
Radius search filtering is fast: it deletes feature points that do not have enough neighboring points within a specified radius. Before outlier filtering, one pass of radius search filtering with loose parameters is performed to quickly remove outer points at particularly isolated positions in space.
Outlier filtering analyzes and counts each feature point and its neighborhood in the point cloud, and deletes as outliers the feature points that do not conform to the statistics. The statistical method analyzes the distances within each point's neighborhood: for each feature point, the average distance to its neighborhood points is computed, under the assumption that these distances follow a Gaussian distribution. First the neighborhood points of each feature point are found, then the distance $d_{ij}$ from each feature point to each of its neighborhood points is calculated; if there are $n$ feature points, each with $k$ neighborhood points, there are $nk$ distances in total. Modeling the distance parameters with the Gaussian distribution $d \sim N(\mu, \sigma^2)$, the point cloud distance mean and standard deviation over all feature points and their neighborhoods are

$$\mu = \frac{1}{nk} \sum_{i=1}^{n} \sum_{j=1}^{k} d_{ij}, \qquad \sigma = \sqrt{\frac{1}{nk} \sum_{i=1}^{n} \sum_{j=1}^{k} \left(d_{ij} - \mu\right)^2} \tag{8}$$

where $\mu$ is the point cloud distance mean and $\sigma$ is the point cloud standard deviation.
Then the mean point distance of each feature point to its neighborhood points is computed as $\bar{d}_i = \frac{1}{k}\sum_{j=1}^{k} d_{ij}$. All feature points in the point cloud are traversed, and a feature point is retained only if its mean point distance lies within the confidence interval of the Gaussian distribution, i.e.:

$$\mu - \alpha\sigma \le \bar{d}_i \le \mu + \alpha\sigma \tag{9}$$

where $\alpha$ scales the confidence interval. If formula (9) is not satisfied, the point is an outlier. Applying outlier filtering to the radius-search-filtered point cloud yields the preprocessed point cloud $P''$.
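A sketch of the statistical outlier removal of formulas (8)-(9), using scipy's KD-tree for the neighborhood search; the defaults for k and the confidence factor alpha are illustrative, not from the patent:

```python
import numpy as np
from scipy.spatial import cKDTree

def statistical_outlier_removal(cloud, k=16, alpha=1.0):
    """Remove points whose mean neighbor distance leaves the Gaussian band, Eq. (8)-(9)."""
    xyz = cloud[:, :3]
    tree = cKDTree(xyz)
    # query k+1 neighbors because the nearest neighbor of each point is itself
    dists, _ = tree.query(xyz, k=k + 1)
    d_bar = dists[:, 1:].mean(axis=1)              # mean neighbor distance per point
    mu, sigma = d_bar.mean(), d_bar.std()          # global statistics, Eq. (8)
    keep = np.abs(d_bar - mu) <= alpha * sigma     # confidence band, Eq. (9)
    return cloud[keep]
```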
To improve efficiency, the point cloud corresponding to a key frame could be downsampled according to the image resolution before the input image is converted, reducing the number of feature points, e.g. by generating feature points in an interlaced manner. In actual measurement, however, the amount of point cloud data generated from a single frame image is not large and does not seriously affect the real-time performance of the composition; therefore, to avoid distortion, no downsampling is performed. The filtering preprocessing is carried out after the point cloud is generated, and the preprocessed point cloud is registered using the point cloud and its corresponding pose.
The point cloud obtained by filtering preprocessing is a rigid point set; using the pose result of SLAM visual positioning optimization, it is registered into the dense sub-map as

$$P_W = \{\, T_i\, p_j \mid p_j \in P'',\; j = 1, \dots, m \,\} \tag{10}$$

where $P''$ denotes the preprocessed point cloud, $m$ the number of feature points in $P''$, $T_i$ the pose of the $i$-th point cloud frame in the optimization information (which is also the pose of the image's corresponding point cloud), and $P_W$ the point cloud in the world coordinate system.
Unlike sparse map points generated in positioning, the point cloud generated by the screened key frames is dense, and can be directly registered to the three-dimensional dense map of the unmanned aerial vehicle by using pose.
Pose error causes point cloud ghosting in the overlap region between consecutive key frames. However, because the pose drift between two consecutive key frames is small and has been filtered when their point clouds are registered, the ghosting is not obvious: the overlap between the two key frame point clouds appears as two nearly coincident repeated points for a given observed part. Such insignificant ghosting can be removed by voxel filtering.
That is, after pose registration according to the filtered and preprocessed point cloud and before generating the dense sub-map, the method further includes: and carrying out voxel filtering on the point cloud after pose registration, wherein in the voxel filtering method, the centroid point of the small cube in the voxel filtering method is used for representing the characteristic point in the small cube.
Typically voxel filtering computes a large cube that just wraps the point cloud according to the input point cloud, partitions the large cube into a number of small cubes according to resolution, and for the point cloud within each small cube, approximates all points within the small cube directly with the center of the small cube.
It should be noted that a special case exists where a small cube contains few feature points or no feature point lies at its center, so the invention improves this approximation by computing the centroid coordinates within each small cube to represent all feature points inside it. Assuming each small cube contains $n$ feature points, the centroid coordinates of the small cube are

$$\bar{p} = \frac{1}{n} \sum_{i=1}^{n} p_i \tag{11}$$

where $p_i$ is a feature point of the textured point cloud and $\bar{p}$ is the representative point within the small cube, i.e., the centroid. After all the small cubes are processed, the set formed by the representative points of all small cubes, called the point cloud set, is the local three-dimensional dense map after voxel filtering, i.e., the three-dimensional dense map detected by the UAV.
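A centroid voxel filter per formula (11) can be sketched with a voxel hash; the grouping strategy below (floor-division into voxel indices) is an assumption of the sketch, not prescribed by the patent:

```python
import numpy as np

def centroid_voxel_filter(cloud, voxel):
    """Replace the points of each voxel by their centroid, Eq. (11)."""
    xyz = cloud[:, :3]
    keys = np.floor((xyz - xyz.min(axis=0)) / voxel).astype(np.int64)
    # group rows by voxel index, then average positions and colors per voxel
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    out = np.zeros((inv.max() + 1, cloud.shape[1]))
    counts = np.bincount(inv).astype(np.float64)
    for c in range(cloud.shape[1]):
        out[:, c] = np.bincount(inv, weights=cloud[:, c]) / counts
    return out
```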
In addition, the sparse images of the key frames, their corresponding point clouds, and the poses of the key frames registered to the three-dimensional dense map must be recorded for reuse when the three-dimensional dense map is adjusted.
For example, besides the insignificant ghosting produced when the point clouds generated from consecutive sparse images are registered into the three-dimensional dense map, there is another special case: when the UAV loops back to the vicinity of a historical pose, its pose needs to be updated, since most of its current observations overlap part of the previously established map.
The bag-of-words model with a dictionary structure can be used to judge whether a loop has occurred: similar frames are retrieved from a database by matching visual bag-of-words vectors, followed by similarity verification. Two key frames are judged to form a loop when the similarity function meets a threshold. The UAV's pose is then updated with a Generalized Iterative Closest Point (GICP) algorithm based on KD-tree nearest-neighbor search; the update is limited to replacing the UAV's current pose estimate with the optimal pose obtained from the solution.
In the GICP method, the corresponding matching point pairs in the two point cloud frames are found first; the optimal pose transformation matrix between the current point cloud and the historical loop point cloud is then solved from the matching point pairs, reducing the accumulated error of the dense point cloud map.
In one embodiment, when the position of the key frame reaches a historical position: the filtered and preprocessed point cloud in the key frame is organized into a KD-tree data structure; matching point pairs are searched based on the KD-tree data structure and the point cloud in the key frame corresponding to the historical position; the theoretical pose change between the key frame and the key frame corresponding to the historical position is calculated with the GICP method, taking the matching point pairs as input; and the pose corresponding to the key frame is updated according to the theoretical pose change.
That is, the pose of the unmanned aerial vehicle (i.e., the pose of the key frame is updated) is updated by the calculated theoretical pose variation, so that the accuracy of the pose of the unmanned aerial vehicle can be improved, and the accuracy of the dense sub-map can be improved. In addition, the pose of the previous unmanned aerial vehicle (namely the pose of the key frame before the current key frame) can be updated according to the updated pose of the unmanned aerial vehicle, so that the map building precision is further improved, and the accumulated error is eliminated.
Specifically, when searching for matching point pairs, to improve efficiency the invention organizes the point cloud frame serving as the target point cloud into a KD-tree data structure. Starting from the root node, the left or right subtree is selected recursively according to the coordinates of the query point; the nearest point found so far is recorded during the search, and the record is updated whenever a closer point is found. After reaching a leaf node of the KD-tree, the search backtracks to the parent node and checks whether the subtree on the other side could contain a closer point, until the corresponding matching point is found. Of course, either point cloud frame may serve as the source point cloud, with the other as the target point cloud.
KD-tree is a data structure for fast approximate nearest neighbor searching of points in multidimensional space. In GICP algorithm, the KD-tree is used for nearest neighbor searching, so that the searching efficiency can be greatly improved.
After GICP finds the matching point pairs, it optimizes the transformation matrix using the geometric information between them. For each feature point, GICP computes its local covariance matrices in the source and target point clouds, which describe the shape and distribution of the point cloud around that point. Using the correspondence between the point clouds and the covariance matrices, the GICP algorithm solves for the optimal rigid transformation (rotation and translation) that minimizes the registration error from the source to the target point cloud. The error metric is typically based on the Mahalanobis distance, which takes the covariance matrices of the points into account to better reflect the geometric characteristics of the point cloud. The computed optimal transformation matrix is applied to the source point cloud, updating its position. These steps are repeated until an iteration stop condition is met, such as the transformation amount falling below a threshold or a preset number of iterations being reached. Finally, the aligned point cloud and the transformation matrix are output, representing the aligned state of the source and target point clouds after the optimal transformation.
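For orientation, the correspondence-then-solve loop can be sketched as plain point-to-point ICP over a KD-tree. Note this stand-in deliberately omits the per-point covariances and Mahalanobis weighting that distinguish GICP, so it only outlines the iteration structure under that simplification:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_pose_update(source, target, iters=20, tol=1e-6):
    """Point-to-point ICP with KD-tree correspondences (simplified GICP stand-in)."""
    src = source.copy()
    T = np.eye(4)
    tree = cKDTree(target)                       # target cloud indexed once
    for _ in range(iters):
        _, idx = tree.query(src)                 # nearest-neighbor matching pairs
        tgt = target[idx]
        mu_s, mu_t = src.mean(axis=0), tgt.mean(axis=0)
        H = (src - mu_s).T @ (tgt - mu_t)        # cross-covariance of centered sets
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T                           # optimal rotation (Kabsch solution)
        if np.linalg.det(R) < 0:                 # repair reflection if it occurs
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        src = src @ R.T + t                      # apply the incremental transform
        dT = np.eye(4); dT[:3, :3] = R; dT[:3, 3] = t
        T = dT @ T                               # accumulate the pose change
        if np.linalg.norm(t) < tol:              # stop when the update is tiny
            break
    return T                                     # theoretical pose change estimate
```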
In a multi-unmanned aerial vehicle execution task scene, dense sub-maps constructed by each unmanned aerial vehicle are required to be fused, and in the invention, the fusion is specifically based on the overlapping area between the maps.
Firstly, each UAV builds its own dense sub-map in real time, and the base station end continuously evaluates each incrementally growing dense sub-map, i.e., judges whether to carry out map fusion.
If so, the dense sub-maps are considered to need fusing, and it is judged whether an overlap region exists. If it does, the overlap region is acquired, matching point pairs are searched within the overlap region based on an octree, normal vectors are then computed to remove mismatched point pairs, and registration fusion is finally carried out, yielding the global map.
If not, that is, map fusion is not needed, the dense sub-map is continuously acquired, and then the judging process is continuously executed.
Specifically, the dense sub-map created by a UAV grows continuously; when the UAV returns to a historical position, the visual dense point cloud mapping algorithm updates and adjusts the map, and the scale of the dense sub-map remains basically unchanged. When the UAVs are scattered, no overlap region exists during global mapping, but once one UAV's observations overlap another's, overlap regions appear in the two constructed dense sub-maps. It is therefore necessary to determine when to start map fusion and how to find the map overlap region.
As a specific implementation, the dense sub-maps are fused once a UAV's dense sub-map has grown to a certain scale; that is, before acquiring the different dense sub-maps, the method includes: acquiring feature information of the dense sub-map, and continuing execution when the feature information meets a preset condition. The feature information comprises the accumulated number of key frames, the length of the pose graph, the accumulated time, and the number of feature points of the dense sub-map.
In the space strategy, the length of the pose graph of the unmanned aerial vehicle, the accumulated number of key frames input during the construction of the dense sub-map and the number of feature points in the dense sub-map can reflect the increase of the dense sub-map.
Regarding the length of the UAV's pose graph: the UAV's pose at each moment is represented by a node, and the pose graph is the graph structure obtained by connecting adjacent nodes with edges, using an undirected graph from graph theory. The nodes of the pose graph are the UAV's poses, and its length grows as the UAV's pose transforms.
The pose graph length generally refers to the number of nodes in the pose graph, each node representing the pose (position and orientation) of the UAV at one moment. Through its nodes (poses) and edges (relative motion or observations between poses), the pose graph describes the UAV's motion path through the environment. Whether the pose graph length equals the number of key frames depends on how the method processes it: one option updates the pose graph only when a key frame appears, the other updates it in real time.
In terms of time strategy, fusion can be started for the current sub-graph when the accumulated time from the beginning of building the dense sub-map to the current moment reaches a threshold.
Therefore, the space strategy and the time strategy are combined as the joint condition for starting dense sub-map fusion. Expressed mathematically, the condition for judging dense sub-map fusion is

$$L \ge L_{th} \;\lor\; N_{kf} \ge N_{th} \;\lor\; t \ge t_{th} \;\lor\; N_p \ge P_{th} \tag{12}$$

where $L$ denotes the pose graph length, $L_{th}$ the pose graph length threshold, $N_{kf}$ the accumulated number of key frames, $N_{th}$ the threshold on the accumulated number of key frames, $t_{th}$ the accumulated time threshold, $t$ the accumulated time, $N_p$ the number of points in the dense sub-map, and $P_{th}$ the dense sub-map point count threshold.
The pose graph length and the accumulated time are expressed as

$$L = \sum_{i=1}^{N-1} \left\| \mathrm{trans}\!\left(T_{i+1}\right) - \mathrm{trans}\!\left(T_i\right) \right\|, \qquad t = t_{now} - t_0 \tag{13}$$

where $T_i$ denotes the pose of the $i$-th key frame, i.e., its pose transformation matrix, $T_{i+1}$ the pose of the $(i+1)$-th key frame, $\mathrm{trans}(\cdot)$ the translation component of a pose, $t_{now}$ the current time, and $t_0$ the dense sub-map construction start time.
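The trigger of formulas (12)-(13) reduces to a few comparisons; a sketch with illustrative names, reading the disjunction in (12) as "any criterion fires":

```python
import numpy as np

def should_fuse(poses, n_keyframes, t_now, t0, n_points,
                L_th, N_th, t_th, P_th):
    """Start sub-map fusion when any space/time criterion is met, Eq. (12)-(13)."""
    # pose-graph length: summed translation between consecutive key poses
    L = sum(np.linalg.norm(poses[i + 1][:3, 3] - poses[i][:3, 3])
            for i in range(len(poses) - 1))
    t = t_now - t0                              # accumulated mapping time
    return (L >= L_th or n_keyframes >= N_th or
            t >= t_th or n_points >= P_th)
```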
When fusion is required, the AABB (Axis-Aligned Bounding Box) algorithm generates an AABB bounding box aligned with the coordinate axes around each filtered point cloud, quickly judges whether the dense sub-maps overlap, and roughly locates the overlap region.
After the overlap regions between different dense sub-maps are found, traversing all map points of the dense sub-maps to be fused can be avoided, saving computation time. Registration and fusion based on the overlap-region point clouds then requires establishing point associations.
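A sketch of the AABB overlap test, assuming clouds stored as row-per-point arrays (function name and return convention are illustrative):

```python
import numpy as np

def aabb_overlap(cloud_a, cloud_b):
    """Axis-aligned bounding-box intersection of two clouds; None if disjoint."""
    lo = np.maximum(cloud_a[:, :3].min(axis=0), cloud_b[:, :3].min(axis=0))
    hi = np.minimum(cloud_a[:, :3].max(axis=0), cloud_b[:, :3].max(axis=0))
    if np.any(lo >= hi):
        return None                             # the boxes do not intersect
    in_a = np.all((cloud_a[:, :3] >= lo) & (cloud_a[:, :3] <= hi), axis=1)
    in_b = np.all((cloud_b[:, :3] >= lo) & (cloud_b[:, :3] <= hi), axis=1)
    return cloud_a[in_a], cloud_b[in_b]         # points inside the overlap region
```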
In the embodiment of the invention, searching the matching point pair in the overlapping area comprises the following steps: the feature points in one overlapping region are inserted into the octree data structure and the pairing points of the feature points in the other overlapping region are found from the octree data structure.
As an example, assume the point sets of the overlap regions of the two dense sub-maps are $P_A$ and $P_B$, respectively. An octree is created over $P_B$, forming the octree structure of $P_B$; then, for each feature point of $P_A$, the corresponding pairing point is searched in this octree structure, until all pairing point pairs are found.

A situation can arise in which several points of $P_A$ match the same point of $P_B$; in that case, within the matched voxel, the point of $P_B$ nearest to the corresponding point of $P_A$ is taken as its matching point. The matching points of the overlap region are thus quickly found through the octree structure:

$$q_i = \arg\min_{q \in P_B} \left\| p_i - q \right\| \tag{14}$$

where $p_i$ denotes a point of $P_A$ and $q$ denotes a point of $P_B$.
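Formula (14) is a nearest-neighbor query per point of $P_A$. The sketch below uses a KD-tree where the patent builds an octree, as an equivalent nearest-neighbor index for illustration only; clouds are assumed to be row-per-point arrays:

```python
from scipy.spatial import cKDTree

def match_overlap(P_a, P_b):
    """For every point of P_a, find its nearest counterpart in P_b, Eq. (14)."""
    tree = cKDTree(P_b[:, :3])                  # stand-in for the patent's octree
    _, idx = tree.query(P_a[:, :3])             # nearest-neighbor indices into P_b
    return list(zip(range(len(P_a)), idx))      # index pairs (i in P_a, j in P_b)
```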
As a specific implementation form, screening the matching point pairs based on their normal vectors includes: calculating the normal vectors of the two feature points in a matching point pair by principal component analysis, i.e., establishing a local plane of each feature point from its neighborhood points, calculating the normal vector of that local plane, and taking it as the normal vector of the feature point; calculating the angle between the normal vectors of the two feature points; and, when this angle is larger than a preset angle threshold, deleting the corresponding matching point pair, i.e., the pair formed by the two feature points.

That is, after a matching point pair is obtained, the angle between the normal vectors of its two feature points is computed and used as the screening condition.
In one embodiment, the normal vectors of the two feature points are determined with Principal Component Analysis (PCA), which estimates the local surface geometry by analyzing the covariance matrix of a feature point's local neighborhood and extracting its principal eigenvectors and eigenvalues. First, for each feature point $p$ in the point cloud, a set of neighborhood points around it is selected, either as all points within a given radius or as its $k$ nearest neighbors; these points constitute the local neighborhood of $p$. If $\bar{p}$ is the centroid of the neighborhood, the covariance matrix $C$ is written as

$$C = \frac{1}{k} \sum_{i=1}^{k} (p_i - \bar{p})(p_i - \bar{p})^{\top} \tag{15}$$

The eigenvalues and corresponding eigenvectors of the covariance matrix are then computed; each eigenvalue represents the dispersion of the data along the direction of its eigenvector. Among the resulting eigenvectors, the one corresponding to the minimum eigenvalue is taken as the normal vector of the feature point's local surface, because the direction of minimum eigenvalue is the direction of smallest data variation, i.e., the normal direction of the local point cloud surface.
If the normal vector angle of two matched points is larger than a preset threshold $\theta_{th}$, the matching point pair is removed: an angle above the threshold means the two matched points differ significantly in geometric surface structure, i.e., a large angle may indicate a mismatch, where the two points are spatially close but actually belong to entirely different objects or map features. The normal vector angle is computed as

$$\theta = \arccos\!\left( \frac{\mathbf{n}_1 \cdot \mathbf{n}_2}{\left\| \mathbf{n}_1 \right\| \left\| \mathbf{n}_2 \right\|} \right) \tag{16}$$

where $\theta$ denotes the angle between the normal vectors $\mathbf{n}_1$ and $\mathbf{n}_2$.
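The screening of formulas (15)-(16) can be sketched as PCA normals plus an angle test; clouds are assumed to be (N, 3) xyz arrays, and taking the absolute value of the dot product to handle the sign ambiguity of PCA normals is an assumption of this sketch:

```python
import numpy as np
from scipy.spatial import cKDTree

def pca_normal(xyz, tree, i, k=10):
    """Normal of point i: eigenvector of the smallest eigenvalue, Eq. (15)."""
    _, nbr = tree.query(xyz[i], k=k)
    q = xyz[nbr] - xyz[nbr].mean(axis=0)        # centered neighborhood
    C = q.T @ q / k                             # local covariance matrix
    w, V = np.linalg.eigh(C)                    # eigenvalues in ascending order
    return V[:, 0]                              # smallest-eigenvalue direction

def screen_pairs(P_a, P_b, pairs, angle_th, k=10):
    """Drop pairs whose normal angle exceeds the threshold, Eq. (16)."""
    ta, tb = cKDTree(P_a), cKDTree(P_b)
    kept = []
    for i, j in pairs:                          # pairs as produced by match_overlap
        n1 = pca_normal(P_a, ta, i, k)
        n2 = pca_normal(P_b, tb, j, k)
        cos = np.clip(abs(n1 @ n2), -1.0, 1.0)  # |.|: PCA normals have sign ambiguity
        if np.arccos(cos) <= angle_th:
            kept.append((i, j))
    return kept
```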
After the matching point pairs are found, the optimal Euclidean transformation can be computed with the GICP registration method, completing dense sub-map fusion and yielding the global map.
The invention also discloses a real-time multi-terminal map fusion device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the method when executing the computer program.
The invention also discloses a computer readable storage medium, which stores a computer program, and the computer program realizes the steps in the above method embodiments when being executed by a processor.
The present invention also provides a computer program product which, when run on a device, causes the device to perform the steps of the method embodiments described above.
The integrated unit modules, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the flow of the methods of the above embodiments by instructing the relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to the apparatus, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described or detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units, may or may not be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Claims (10)

1. The real-time multi-terminal map fusion method is characterized by comprising the following steps of:
acquiring two different dense sub-maps;
Determining an overlap region between two different dense sub-maps using an AABB algorithm;
Searching matching point pairs in the overlapping area, and screening the matching point pairs based on normal vectors of the matching point pairs; the matching point pairs consist of two characteristic points at corresponding positions in different dense sub-maps;
and taking the screened matching point pairs as input information, and fusing two different dense sub-maps by adopting a GICP registration method to generate a global map.
2. The method of claim 1, wherein searching for matching point pairs in the overlapping region comprises:
Inserting the feature points in one overlapping area into an octree data structure, and searching the matching points of the feature points in the other overlapping area from the octree data structure.
3. The method of claim 1, wherein the screening the matching point pairs based on normal vectors of the matching point pairs comprises:
Calculating normal vectors of the two characteristic points in the matching point pair by adopting a principal component analysis method;
Calculating an included angle of normal vectors of the two feature points;
And deleting the corresponding matching point pair when the included angle of the normal vectors of the two characteristic points is larger than a preset included angle threshold value.
4. The method of claim 1, wherein prior to obtaining two different dense sub-maps comprises:
Acquiring the characteristic information of the dense sub-map, and continuing to execute when the characteristic information meets a preset condition; the feature information comprises the accumulated number of key frames, the length of the pose map, the accumulated time and the number of feature points of the dense sub-map.
5. The method for merging real-time multi-terminal maps according to any one of claims 2 to 4, wherein the method for constructing the dense sub-map comprises the following steps:
acquiring a point cloud frame with a pose; the point cloud frame comprises point clouds;
selecting a corresponding point cloud frame as a key frame according to the pose relative variation of the point cloud frame;
and establishing the dense sub-map according to the key frames.
6. The method of claim 5, wherein creating the dense sub-map from the key frames comprises:
Performing filtering pretreatment on the point cloud of the key frame;
Determining whether the position of the key frame reaches a historical position or not based on the filtered and preprocessed point cloud;
And when the position of the key frame does not reach the historical position, carrying out pose registration according to the point cloud after the filtering pretreatment, and generating a dense sub-map.
7. The method of claim 6, wherein the step of performing pose registration according to the filtered and preprocessed point cloud and before generating the dense sub-map further comprises:
voxel filtering is performed on the point cloud after pose registration.
8. The method of claim 7, wherein voxel filtering the point cloud after pose registration comprises:
the centroid point of the small cube in the voxel filtering method is used for representing the characteristic point in the small cube.
9. The method of claim 8, wherein when the location of the key frame reaches a historical location:
designing the point cloud after the filtering pretreatment in the key frame into a KD-tree data structure;
searching for a matching point pair based on the KD-tree data structure and the point cloud in the key frame corresponding to the history position;
calculating the theoretical pose change between the key frame and the key frame corresponding to the historical position by a GICP method, taking the matching point pairs as input;
And updating the pose corresponding to the key frame according to the theoretical pose variation.
10. A real-time multi-terminal map fusion device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-9 when executing the computer program.
CN202410713366.7A 2024-06-04 2024-06-04 Real-time multi-terminal map fusion method and device Pending CN118279515A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410713366.7A CN118279515A (en) 2024-06-04 2024-06-04 Real-time multi-terminal map fusion method and device


Publications (1)

Publication Number Publication Date
CN118279515A true CN118279515A (en) 2024-07-02

Family

ID=91636306


Country Status (1)

Country Link
CN (1) CN118279515A (en)


Legal Events

PB01: Publication