CN116168171A - Real-time dense reconstruction method for clustered unmanned aerial vehicle - Google Patents

Real-time dense reconstruction method for clustered unmanned aerial vehicle Download PDF

Info

Publication number
CN116168171A
Authority
CN
China
Prior art keywords
unmanned aerial
aerial vehicle
point cloud
information
pose
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310188327.5A
Other languages
Chinese (zh)
Inventor
布树辉
董逸飞
张�杰
李坤
夏震宇
翁乐安
陈霖
韩鹏程
曹雪峰
李科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202310188327.5A
Publication of CN116168171A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00 Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/42 Determining position
    • G01S19/45 Determining position by combining measurements of signals from the satellite radio beacon positioning system with a supplementary measurement
    • G01S19/47 Determining position by combining measurements of signals from the satellite radio beacon positioning system with a supplementary measurement the supplementary measurement being an inertial measurement, e.g. tightly coupled inertial
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Automation & Control Theory (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a real-time dense reconstruction method for a clustered unmanned aerial vehicle (UAV) system. Relative positioning is based on GNSS and VIO, and multiple UAVs carrying cameras, IMUs and GNSS receivers scan a target area to complete a collaborative reconstruction. The relative positioning is realized through a dynamic distributed network (DDN). The DDN adjusts the structure of the distributed network in real time using prior relative position information, and each sub-module of the distributed structure completes dense reconstruction using accurate local relative pose information. A central node is set in each sub-module, and the poses between the virtual central nodes of the sub-modules are estimated to fuse the point clouds. A hierarchical point cloud fusion algorithm performs point cloud registration between the UAVs within a sub-module and between different sub-modules of the whole distributed network, yielding a registered high-precision point cloud map and accurate relative poses between the UAVs. Compared with general point cloud registration methods, this registration is more robust, and both the registration accuracy and the accuracy of the recovered relative poses between UAVs are higher.

Description

Real-time dense reconstruction method for clustered unmanned aerial vehicle
Technical Field
The invention relates to the field of real-time three-dimensional dense reconstruction with unmanned aerial vehicles, and in particular to a real-time three-dimensional dense reconstruction method for a clustered unmanned aerial vehicle system.
Background
Real-time three-dimensional dense reconstruction is in increasing demand for disaster rescue and relief, agricultural monitoring, urban planning and similar applications, and it is one of the foundational technologies for future digital city development.
To realize this technology, accurate pose estimation of the unmanned aerial vehicle must first be achieved, and dense point cloud information must be generated from the sensor data by a reconstruction algorithm. For reconstruction tasks of smaller extent, a single unmanned aerial vehicle can perform global route planning, acquire sensor information and complete the reconstruction. However, because the endurance of a single vehicle is limited, such a method can hardly reconstruct a larger scene in a single flight; a single vehicle executing the task also suffers from poor maneuverability and low efficiency in complex tasks, and the time to complete the reconstruction is bounded by the flight time of that single vehicle. At present, methods capable of collaborative three-dimensional dense reconstruction with multiple unmanned aerial vehicles mostly rely on SfM (Structure from Motion) algorithms, which can only perform dense reconstruction offline after data acquisition and cannot reconstruct in real time.
A SLAM (Simultaneous Localization and Mapping) based method for collaborative mapping of large scenes with clustered unmanned aerial vehicles can perform three-dimensional reconstruction in real time during flight and can greatly improve the efficiency of dense reconstruction. VIO (Visual-Inertial Odometry) is a SLAM system that fuses visual sensor and IMU (Inertial Measurement Unit) data; although this positioning approach achieves high accuracy locally, it is difficult to eliminate the accumulated error in a large-scale scene without repeated paths.
At present, real-time three-dimensional dense reconstruction with a single unmanned aerial vehicle is mostly based on route planning and dense reconstruction algorithms; their main drawback is that the efficiency of data acquisition and mapping is limited by the endurance and hardware of a single vehicle, so the real-time requirement cannot be met in most situations. Existing clustered-vehicle technology is biased toward localization and path planning: it often cannot generate a dense point cloud map in real time, or can only generate a sparse point cloud map, which usually cannot fully express the environment and does not meet the requirements in most situations. Although SfM-based three-dimensional reconstruction can use multiple unmanned aerial vehicles to acquire data and obtain an accurate three-dimensional point cloud map, its computational cost is huge and it cannot meet the real-time requirement. Existing SLAM algorithms consume enormous computing power when applied to co-localization and mapping with multiple devices, and they cannot support incrementally adding devices for three-dimensional dense reconstruction with multiple unmanned aerial vehicles.
"CN 201910800208.4-Multi-agent cooperative localization and mapping method and apparatus [ ZH ]" provides a multi-agent cooperative reconstruction scheme. The scheme relies on accurate positioning of single agents and a point cloud fusion technology, common view information among the agents is not fully utilized, and the problems of poor reconstruction instantaneity, poor robustness of the point cloud fusion method and the like exist.
Ensuring the accuracy of dense reconstruction while guaranteeing the real-time performance of multi-UAV collaborative dense reconstruction is the core of the problem. How to extend the dense reconstruction of a single unmanned aerial vehicle to the collaborative, real-time reconstruction of multiple vehicles, and fully exploit the advantages of clustered unmanned aerial vehicles, is the problem to be solved.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method for real-time dense reconstruction of a cluster unmanned aerial vehicle, which comprises the following steps:
step 1: after receiving a three-dimensional dense reconstruction task and entering a route, a plurality of unmanned aerial vehicles in the cluster unmanned aerial vehicle independently initialize VIO and autonomously position to obtain VIO pose estimation information of each unmanned aerial vehicle; the unmanned aerial vehicle is provided with a GNSS, a visual sensor and an IMU;
the co-visibility relationships among the unmanned aerial vehicles are inferred from the GNSS information of each vehicle, and the vehicles are divided into map building subgroups according to these relationships; each map building subgroup comprises several unmanned aerial vehicles with a co-visibility relationship; the relative poses of the vehicles within a map building subgroup are then initialized from the GNSS information, and a central unmanned aerial vehicle is determined using the adjacency relationships;
step 2: according to the VIO pose estimation information of each unmanned aerial vehicle obtained in the step 1, global pose optimization is carried out by combining GNSS information, the scale of local positioning information and the position of global coordinates are updated in real time, and the positioning information of the recovered scale is aligned with an image time stamp; updating the relative pose among the unmanned aerial vehicles in each subgroup by using the initial coordinates of each unmanned aerial vehicle;
step 3: detecting the overlapping rate of images obtained by each unmanned aerial vehicle in the image building subgroup, if the overlapping rate meets the set threshold requirement, taking a relative pose result which is jointly optimized with GNSS information as an initial value, taking joint observation as constraint, and further optimizing the relative pose between unmanned aerial vehicles;
step 4: performing dense three-dimensional reconstruction by using VIO pose estimation information of a single unmanned aerial vehicle and relative pose and picture information among the unmanned aerial vehicles processed in the step 3, and generating a dense three-dimensional point cloud map;
step 5: after each unmanned aerial vehicle obtains a dense point cloud map through the step 4, according to the relative pose of each unmanned aerial vehicle, transforming the dense point cloud generated by each unmanned aerial vehicle into the same coordinate system to realize point cloud fusion;
step 6: carrying out point cloud registration on dense point cloud maps built by different unmanned aerial vehicles in each map building subgroup by using a layered point cloud fusion registration algorithm; after registering the point cloud maps in the subgroups, updating the relative pose of the unmanned aerial vehicle in each subgroup; then, carrying out fusion registration on the point cloud map constructed and fused by each sub-group, and updating the relative pose of the central unmanned aerial vehicle among different sub-groups;
step 7: after the accurate registration is completed, updating the relative pose of each unmanned aerial vehicle by using registration information, and judging whether the sub-group needs to be divided again by using the relative pose information; if the sub-group needs to be re-divided, return to step 1 and re-divide and initialize the dense reconstruction sub-group with these relative poses.
Further, the process of initializing the VIO and performing autonomous positioning in step 1 is as follows: first, the initial coordinate system of the VIO is aligned with gravity using the IMU information on the unmanned aerial vehicle, and the attitude of the vehicle is solved from the transformation between the initial static IMU readings and the gravity vector; then, multiple frames of image information form a sliding window for local batch optimization, vision-only SfM is performed within the sliding window to recover a camera trajectory without scale, and an IMU trajectory is obtained by integrating the IMU information; finally, the camera trajectory and the IMU trajectory are aligned.
Further, the specific process of step 2 is as follows: first, the VIO local pose and GNSS information of the unmanned aerial vehicle are acquired and their timestamps are aligned; then a global optimization problem is constructed from the VIO local poses and the GNSS information so that the residuals built from them are minimized, the variables to be optimized comprising the global pose of the vehicle and the global scale of the VIO-estimated pose; the optimization problem is then solved and the transformation from the local coordinate system to the global coordinate system is obtained; finally, the scale-recovered VIO pose is converted to the camera pose using the extrinsic matrix, and the camera pose is timestamp-aligned with the received image.
Further, the specific method in step 3 for calculating the overlap rate of images acquired by two unmanned aerial vehicles at the same moment is: a homography matrix is computed from matched points of the images to recover the transformation between the two cameras, the vertices of one image are transformed onto the other image, and the ratio of the area of the polygon intersection to the image area is computed.
Further, in step 3, the overlap rates of the images acquired by the unmanned aerial vehicles in the same subgroup are compared; if the overlap rate meets the set threshold requirement, the relative pose between the cameras is further optimized using the co-visibility information: a Sim3 problem is constructed to optimize the relative pose obtained in step 2, and the transformation and scale between the two coordinate systems are recovered from the coordinates of three pairs of matched points in the two coordinate systems.
Further, in the step 4, a dense stereoscopic vision matching algorithm is used for fast stereoscopic matching of adjacent frames and frames with the overlapping rate within a set range at the same moment, so that a depth map is obtained; filtering the obtained depth map, and performing continuous image consistency check to obtain a depth map with higher precision; and after consistency check, removing redundant points in adjacent frames to generate a dense point cloud map.
Further, in step 5, a dense three-dimensional point cloud map constructed by each unmanned aerial vehicle and a real-time relative pose between unmanned aerial vehicles are respectively received, and point cloud fusion is performed depending on the relative positioning between each unmanned aerial vehicle:
for any point p_k of the dense point cloud reconstructed by the k-th unmanned aerial vehicle, it is fused with the point cloud map built by the central unmanned aerial vehicle as p_0 = T_{0k} p_k, where T_{0k} is the relative pose from the k-th vehicle to the central vehicle;
And downsampling the fused dense point cloud map by adopting a voxelization method.
Further, in step 5, two-dimensional point cloud position segmentation is performed by using positioning information, and point clouds are respectively stored in different point cloud blocks; the index of the point cloud block is found by means of the following equation:
x=P x /d cell
y=P y /d cell
wherein P_x and P_y are the x and y coordinates of the point in the central coordinate system, and d_cell is the width of a point cloud block.
Further, the hierarchical point cloud fusion registration algorithm in step 6 is: first determine the gravity direction of the point cloud, then partition the point cloud into blocks along the gravity direction; for each point cloud, project every point onto the same plane perpendicular to the gravity direction, perform coarse registration using the line and plane features of the two-dimensionally compressed point clouds of the different layers, and cluster the registered transformation results to obtain the most probable transformation; coarsely transform the two dense point clouds with this result, and then finely register the dense point clouds with an ICP algorithm combined with the three-dimensional point information of the corresponding image feature points.
Further, in step 6, when the IMU and GNSS information is missing or unreliable, the gravity direction is computed from the point cloud information alone: curvature is first computed for the two point clouds, and the normal direction of the dominant surface features is selected as the gravity direction.
Advantageous effects
Compared with the prior art, the invention has the following beneficial effects:
1. Compared with existing methods, the invention improves the VIO algorithm: a pre-optimization outlier removal strategy makes the construction and solution of the optimization problem more robust, and the feature point selection and variable sliding-window strategies suit platforms with limited computational resources. In large-scale scenes, the unmanned aerial vehicle can effectively estimate its pose in real time. To meet the requirements of the computing platform, the improved algorithm can either process the pose estimation of all vehicles centrally on one device or estimate the pose independently on each vehicle.
2. The multi-UAV relative positioning method used by the invention relies on a locally accurate visual-inertial odometer, GNSS information and the co-visibility relationships among the vehicles. Compared with existing multi-UAV relative positioning methods, the proposed method can incrementally increase the number of mapping vehicles and can perform global pose optimization and relative pose computation for each vehicle in real time even when computational resources are very limited. For dense reconstruction tasks it provides accurate positioning at a relatively low computational cost.
3. The proposed hierarchical point cloud registration algorithm can accurately register dense point cloud maps. Even for dense point cloud maps with relatively large noise, the method still works and is fairly robust.
4. Compared with conventional dense reconstruction with a single unmanned aerial vehicle, the clustered-UAV dense reconstruction greatly improves the reconstruction efficiency and supports real-time multi-UAV collaborative dense reconstruction. For more complex dense reconstruction tasks it offers higher flexibility, and the efficiency and accuracy of mapping can be improved by reasonably distributing the task among the vehicles.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a general flow of the present invention;
FIG. 2 is a schematic diagram of the present invention dividing dense reconstruction sub-modules (subgroups);
FIG. 3 is a schematic flow diagram of a clustered drone real-time large scale dense reconstruction in a dense reconstruction subgroup of the present invention;
FIG. 4 is a schematic flow chart of the invention for realizing autonomous positioning of a single unmanned aerial vehicle;
FIG. 5 is a schematic flow chart of the present invention for implementing real-time dense reconstruction;
FIG. 6 is a flow of hierarchical point cloud registration of the present invention;
Detailed Description
To guarantee the real-time performance of multi-UAV collaborative dense reconstruction while ensuring its accuracy, the invention provides a real-time dense reconstruction method for large scenes with clustered unmanned aerial vehicles, which realizes collaborative, real-time, large-scale three-dimensional dense reconstruction with multiple vehicles and effectively improves the efficiency of dense reconstruction of large scenes while maintaining accuracy.
The clustered unmanned aerial vehicles are positioned relative to each other based on GNSS (Global Navigation Satellite System) and VIO, and several vehicles carrying cameras, IMUs and GNSS receivers scan the target area to complete the collaborative reconstruction. The relative positioning mode, which combines GNSS signals and co-visibility information, is realized through a dynamic distributed network (DDN). The DDN adjusts the structure of the distributed network in real time using prior relative position information, and each sub-module of the distributed structure completes dense reconstruction using accurate local relative pose information. A central node is set within each sub-module, and the poses between the virtual central nodes of the sub-modules are estimated to fuse the point clouds. To ensure the consistency of the global point cloud map, a hierarchical point cloud fusion algorithm performs point cloud registration between the vehicles within a sub-module and between different sub-modules of the whole distributed network, yielding a registered high-precision point cloud map and accurate relative poses between the vehicles. Compared with general point cloud registration methods, this registration is more robust, and both the registration accuracy and the accuracy of the recovered relative poses between vehicles are higher.
The method specifically comprises the following steps:
step 1: after receiving the three-dimensional dense reconstruction task and entering the route, the plurality of unmanned aerial vehicles independently initialize and autonomously position the VIO to obtain the VIO pose estimation information of each unmanned aerial vehicle; the process of initializing and autonomous positioning of the VIO alone here is: and constructing an optimization problem by utilizing the matching information of the feature points between the adjacent image frames and the IMU integral information, and solving by using a ceres solver to obtain VIO pose estimation information of each unmanned aerial vehicle.
The specific process is as follows: the initial coordinate system of the VIO is aligned with gravity using the IMU information on the unmanned aerial vehicle; the attitude of the vehicle at that moment is solved from the transformation between the initially static IMU readings and the gravity vector, and the initial coordinate system is set as a FLU (Front-Left-Up) coordinate system. Ten frames of image information form a sliding window for local batch optimization: vision-only SfM is performed within the sliding window to recover a camera trajectory without scale, and an IMU trajectory is obtained by integrating the IMU information. Finally, the camera trajectory and the IMU trajectory are aligned.
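As an illustration of the gravity alignment during initialization, the sketch below (an assumption about one possible implementation, not the patent's actual code; the function name and the choice of the world "up" axis are illustrative) estimates the initial attitude by rotating the mean static accelerometer reading onto the gravity-aligned axis:

```python
import numpy as np

def initial_attitude_from_gravity(static_accel_samples):
    """Rotation aligning the mean static accelerometer reading with gravity.

    static_accel_samples: (N, 3) accelerometer readings [m/s^2] taken at rest.
    Returns a 3x3 rotation matrix R such that R @ a_mean points along +Z
    (the gravity-aligned world axis of the FLU convention assumed here).
    """
    a = np.mean(static_accel_samples, axis=0)
    a = a / np.linalg.norm(a)            # direction of measured gravity in the body frame
    g = np.array([0.0, 0.0, 1.0])        # assumed world "up" axis
    v = np.cross(a, g)                   # rotation axis
    s = np.linalg.norm(v)
    c = float(np.dot(a, g))
    if s < 1e-8:                         # already aligned (or exactly opposite)
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues formula for the rotation taking a onto g
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / (s ** 2))
```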
The key technology of the step is to accurately position a single unmanned aerial vehicle:
First, to cope with limited onboard computing power, a FAST-feature-point strategy is added at the front end; by comparison, this choice of feature points roughly doubles the processing speed of the positioning algorithm. Corners detected by FAST are sometimes densely distributed, and feature points that are too close to each other provide nearly redundant constraints in the bundle adjustment, so their cost-effectiveness is low. The invention therefore proposes a point-cluster selection strategy: after feature points are found in a single-channel (grayscale) image, each feature point and a certain number of its neighbours form a cluster; the qualities of the points in the cluster are compared and the highest-quality point is kept; a circular mask is drawn with the centroid of the cluster as its centre and the distance to the farthest point of the cluster as its radius, and the other points inside the mask are discarded. This effectively guarantees both the quality and the uniformity of the feature points. For feature point screening in multi-channel images, feature points are extracted from each channel separately and the point-cluster strategy is then applied to the combined set.
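A minimal sketch of this idea follows (an assumption, not the patent's code): it uses OpenCV's FAST detector and a simplified variant of the cluster rule in which corners are processed in order of response and a fixed-radius mask suppresses their neighbours, which likewise promotes quality and uniformity.

```python
import cv2
import numpy as np

def detect_clustered_fast(gray, cluster_radius=15, threshold=20):
    """FAST corners thinned by suppressing weaker corners near a kept one.

    gray: single-channel uint8 image. cluster_radius and threshold are
    illustrative values. Returns the kept cv2.KeyPoint objects.
    """
    fast = cv2.FastFeatureDetector_create(threshold=threshold)
    kps = sorted(fast.detect(gray, None), key=lambda k: -k.response)  # best first
    mask = np.ones(gray.shape[:2], dtype=np.uint8)
    kept = []
    for kp in kps:
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        if mask[y, x]:
            kept.append(kp)
            cv2.circle(mask, (x, y), cluster_radius, 0, -1)  # suppress neighbours
    return kept
```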
The vision-inertial bundle adjustment equation is used in the VIO optimization part. In the residual construction, three types of residuals are built: the visual residual, the IMU residual, and the prior residual term produced by sliding-window marginalization. Let the window size be K+1; the overall optimization is

\min_X \Big\{ \|e_p - H_p X\|^2 + \sum_{(k_1,k_2,j)\in J} \rho\big(\|r_J^{k_1 k_2 j}(X)\|^2_{W_r}\big) + \sum_{k\in K} \|r_s^{k}(X)\|^2_{W_s^k} \Big\}

where e_p and H_p are the prior information from marginalization; J is the set of visual observations, with k_1, k_2 \in K and K the sliding window; r_J^{k_1 k_2 j} is the re-projection residual of a feature j commonly observed by image frames k_1 and k_2, with W_r the information matrix of the visual residual; and r_s^{k} is the residual built from the IMU observations, with W_s^k the information matrix of the IMU residual. The specific construction of these residuals is discussed below.
The variables X to be optimized can be expressed as

X = [X_0, X_1, \ldots, X_n, \lambda_0, \lambda_1, \ldots, \lambda_m, t_d]

X_k = [p^w_{b_k}, v^w_{b_k}, q^w_{b_k}, b_a, b_g]

where X_k is the state of each frame within the window, comprising its position p^w_{b_k}, velocity v^w_{b_k} and rotation q^w_{b_k}, together with the accelerometer bias b_a and gyroscope bias b_g of the IMU; \lambda_0, \ldots, \lambda_m are the inverse depths of all feature points within the window; and t_d is the time offset, mainly caused by the IMU data not sharing a common timestamp with the image data.
To reduce the influence of outliers on the algorithm and further improve its robustness, a Huber robust kernel is applied to the visual residual term. The Huber norm is defined as

\rho(s) = \begin{cases} s, & s \le 1 \\ 2\sqrt{s} - 1, & s > 1 \end{cases}
visual residual:
the plane points are all assumed to lie on the normalized plane. The same three-dimensional points observed by the two frames are respectively on the normalized plane
Figure BDA0004104565520000088
Then +.>
Figure BDA0004104565520000089
Point re-projection to +.>
Figure BDA00041045655200000810
And (3) constructing a residual function on a camera normalization plane of the frame. To obtain three-dimensional point coordinates->
Figure BDA00041045655200000811
Requiring recovery
Figure BDA00041045655200000812
Coordinates of the point in the camera system of the i-th frame and projecting it into the camera coordinate system of the j-th frame:
Figure BDA00041045655200000813
wherein T is bc Representing camera out-parameters, i.e. camera to IMU variations,
Figure BDA00041045655200000814
representing the transformation of the i-th frame body coordinate system to the initial world coordinate system, i.e. the pose of the i-th frame body in the world coordinate system, +.>
Figure BDA00041045655200000815
Representing the transformation of the j-th frame world coordinate system into the body coordinate system,/for>
Figure BDA00041045655200000816
Representing the transformation of the body system into the camera coordinate system. The visual residual may ultimately be expressed as:
Figure BDA0004104565520000091
IMU residual:
The IMU provides acceleration and angular velocity through its accelerometer and gyroscope, and the rotation and displacement between two frames can be obtained by integration. In the back-end nonlinear optimization the poses are repeatedly adjusted, and re-integrating the IMU measurements between frames at every adjustment would be time-consuming. A pre-integration strategy is therefore adopted so that the IMU observations do not have to be re-integrated after the variables to be optimized are adjusted. According to the IMU pre-integration formulation,

\hat{\alpha}_{b_k b_{k+1}} = \iint_{t\in[t_k,t_{k+1}]} R^{b_k}_{t}\,(\hat{a}_t - b_{a_t} - n_a)\,dt^2

\hat{\beta}_{b_k b_{k+1}} = \int_{t\in[t_k,t_{k+1}]} R^{b_k}_{t}\,(\hat{a}_t - b_{a_t} - n_a)\,dt

\hat{\gamma}_{b_k b_{k+1}} = \int_{t\in[t_k,t_{k+1}]} \tfrac{1}{2}\,\Omega(\hat{\omega}_t - b_{w_t} - n_w)\,\gamma^{b_k}_{t}\,dt

where \hat{\alpha}_{b_k b_{k+1}}, \hat{\beta}_{b_k b_{k+1}}, \hat{\gamma}_{b_k b_{k+1}} are the pre-integrated displacement, velocity and attitude transformation between the two frames; \hat{a}_t, \hat{\omega}_t are the IMU measurements; b_{a_t}, b_{w_t} are the IMU biases and n_a, n_w the IMU white noise; R^{b_k}_{t} is the pose of the body coordinate system at time t relative to the previous frame, and \gamma^{b_k}_{t} is the quaternion form of that pose.
The IMU residual can thus be constructed as

r_s^{k} = \big[\,r_{\alpha},\ r_{\beta},\ r_{\theta},\ r_{b_a},\ r_{b_g}\,\big]^T

where r_{\alpha}, r_{\beta}, r_{\theta}, r_{b_a}, r_{b_g} are the displacement residual, velocity residual, attitude residual and the IMU bias residuals, respectively; each compares the corresponding pre-integrated term with the quantity predicted from the variables to be estimated, which have been described above.
The biases appear inside the pre-integration terms. If a bias is updated during optimization, the pre-integration terms in the residuals would have to be recomputed at the next iteration, which should be avoided as far as possible. To solve this problem, the error propagation model of the ESKF (error-state Kalman filter) is used; it not only handles the bias update, but also yields the covariance (information) matrix along the way.
A priori terms:
After each optimization, the data of the oldest frame or of the second-newest frame is discarded as the sliding window slides. If these constraints were simply dropped, the loss of information would lead to a large accumulated error. Extracting the constraint information as a prior term added to the subsequent optimizations greatly suppresses the accumulated error while reducing the computational complexity.
The marginalization uses the Schur complement: a new prior is constructed from all marginalized measurements associated with the removed states. After the Schur complement is applied to the removed frames, the constraint information is contained in the incremental equation

H^{*}\,\delta x = b^{*}

where H^{*} is the Hessian matrix after the Schur complement, which is still symmetric. From H^{*} the Jacobian J_0 and the residual e_0 of the prior term can be recovered, e.g. through the eigen-decomposition H^{*} = V \Lambda V^T:

J_0 = \Lambda^{1/2} V^T, \qquad e_0 = \Lambda^{-1/2} V^T b^{*}

When the state is later updated, the prior residual is updated by the first-order approximation

e = e_0 + J_0\,dx

where dx is the difference between the updated state and the state at the time of marginalization.
To obtain better robustness, a pre-optimization strategy is adopted before the optimization: visual features with excessively large residuals are removed in a pre-optimization pass. In addition, several safeguards prevent the VIO system from failing. For the IMU observations, each measurement is checked against a threshold; if the threshold is exceeded, the frame is considered bad data and discarded. For the visual features, if too few features are continuously tracked, the VIO system is considered to have failed and is restarted.
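The safeguards above amount to simple magnitude and count checks; the following sketch (hypothetical function names and threshold values, assumed rather than taken from the patent) shows how such gates could look:

```python
import numpy as np

def imu_sample_ok(accel, gyro, accel_limit=50.0, gyro_limit=10.0):
    """Reject an IMU frame whose magnitudes exceed plausible limits
    (accel in m/s^2, gyro in rad/s; limits are illustrative)."""
    return (np.linalg.norm(accel) < accel_limit
            and np.linalg.norm(gyro) < gyro_limit)

def vio_should_restart(num_tracked_features, min_tracked=15):
    """Declare VIO failure when too few features are continuously tracked."""
    return num_tracked_features < min_tracked
```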
The purpose of determining the dense reconstruction subgroups is to decide whether the relative pose should be further optimized with co-visibility information and whether a three-dimensional dense reconstruction should be performed. Therefore, the co-visibility relationships among the unmanned aerial vehicles are inferred from the GNSS information of each vehicle, and the vehicles are divided into map building subgroups accordingly. Each map building subgroup comprises several unmanned aerial vehicles with a co-visibility relationship. The relative poses of the vehicles within a subgroup are then initialized from the GNSS information, and a central unmanned aerial vehicle is voted using the adjacency relationships; the central vehicle is the one with the most co-visibility relationships in the subgroup.
Step 2: and (3) according to the VIO pose estimation information of each unmanned aerial vehicle obtained in the step (1), carrying out global pose optimization by combining with GNSS information, updating the scale of the local positioning information and the position of the global coordinates in real time, and aligning the positioning information of the recovered scale with the image time stamp.
The relative poses between the unmanned aerial vehicles in each subgroup are updated using the initial coordinates of each vehicle. For example, the relative pose between UAV 1 and UAV 2 in a subgroup is updated as

T_{12} = (T^{w}_{1})^{-1}\,T^{w}_{2}

where T^{w}_{1} and T^{w}_{2} are the poses of the two vehicles in the world coordinate system and T_{12} is their relative pose.
This step obtains a more accurate global pose for each unmanned aerial vehicle and the relative poses between vehicles. First, the VIO local pose of the vehicle and the GNSS information are acquired and their timestamps are aligned. A global optimization problem is then built from the VIO local poses and the GNSS information so that the residuals constructed from them are minimized; the variables to be optimized comprise the global pose of the vehicle and the global scale of the VIO-estimated pose. The optimization problem is then solved with Ceres, and the transformation from the local coordinate system to the global east-north-up (ENU) coordinate system is obtained. The scale-recovered VIO pose is converted to the camera pose using the extrinsic matrix, and the camera pose is timestamp-aligned with the received image.
The key technology in this step is GNSS loose coupling optimization.
1) Optimizing pose and scale
To realize the relative positioning of the multiple unmanned aerial vehicles and to recover the scale of their positioning and point clouds, a GNSS loose-coupling algorithm is introduced. The local pose was obtained in the previous step; coupling the VIO-solved local pose with the GNSS observations recovers the scale of the relative pose and optimizes the current global pose. In addition, to realize multi-device collaborative dense reconstruction, the relative poses of the multiple vehicles are recovered using the GNSS information.
Firstly, initializing a GNSS loose coupling system, taking acquired GNSS information of a first frame as initial ENU coordinates, and recording a VIO initial local coordinate system at the same moment. The GNSS information that is subsequently fetched will be converted to a transformation to the original GNSS coordinate system using the GeographicLib. The VIO local pose information will be converted into a transformation to the original local coordinate system.
After obtaining the VIO information and the GNSS information, the time stamps of the two information are aligned first, then an optimization problem is constructed, the global pose is optimized, and the absolute scale information is recovered. Assuming that the scale is denoted by s, the initial value of s is solved before optimization.
s_0 = \frac{\| p^{ENU} \|}{\| p^{VIO} - p^{VIO}_0 \|}

where p^{ENU} is the displacement of the GNSS coordinates of the current frame from the initial ENU coordinates, p^{VIO} is the displacement of the current frame in the VIO local coordinate system, and p^{VIO}_0 is the initial position in the VIO local coordinate system.
A GNSS loose-coupling optimization function is then constructed. It consists of two parts. One part is the constraint from the inter-frame transformations of the VIO local coordinate system:

e_t = t^{w}_{ij} - s\,t^{v}_{ij}, \qquad e_q = 2\,\big[(q^{v}_{ij})^{-1} \otimes q^{w}_{ij}\big]_{xyz}

where e_q is the rotation constraint and e_t the displacement constraint brought by the VIO-estimated pose; (q^{v}_{ij}, t^{v}_{ij}) is the pose transformation between two adjacent GNSS-timestamp-aligned coordinates in the VIO local coordinate system; s is the scale to be optimized; (q^{w}_{ij}, t^{w}_{ij}) is the corresponding pose transformation in the ENU (east-north-up) coordinate system, which is also a quantity to be optimized; and [\,\cdot\,]_{xyz} is the operation on the quaternion error state that takes its vector part.
The other part is the constraint imposed by the GNSS observations:

e_g = t^{ENU}_k - t^{w}_k

where t^{ENU}_k is the displacement of the current frame relative to the GNSS initial coordinate system, t^{w}_k is the displacement of the current frame in the world coordinate system, and s is the scale to be optimized.
2) Using GNSS loose coupling algorithm to complete relative positioning among multiple unmanned aerial vehicles
Determining the relative positions among the multiple unmanned aerial vehicles amounts to determining the relative poses of their initial coordinate systems. The pose of the central unmanned aerial vehicle is written as T^{w}_{0} = (R^{w}_{0}, t^{w}_{0}). The rotation R^{w}_{0} from the local VIO coordinate system of the central vehicle to the world (ENU) coordinate system is obtained from the global optimization above, and (lon_0, lat_0, alt_0) denotes the longitude, latitude and altitude defining the origin of the central vehicle's GNSS raw (ENU) coordinate system. For any unmanned aerial vehicle k, its pose is defined as T^{w}_{k} = (R^{w}_{k}, t^{w}_{k}); the rotation from this vehicle to the central vehicle is R_{0k} = (R^{w}_{0})^{-1} R^{w}_{k}, and the displacement t_{0k} is computed from the vehicle's longitude, latitude and altitude using GeographicLib. For convenience in the later derivation, T_{0k} denotes the transformation from the initial coordinate system of the k-th vehicle to the initial coordinate system of the central vehicle.
In large-scale, long-duration operation, the GNSS loose-coupling algorithm effectively reduces positioning drift. The relative positioning between the multiple unmanned aerial vehicles is updated as

T_{0k} = (T^{w}_{0})^{-1}\,T^{w}_{k}

where T^{w}_{0} and T^{w}_{k} are the current global poses of the central vehicle and of the k-th vehicle. In the mapping part, the dense point cloud maps can then be fused according to the real-time relative poses and the initial relative poses of the vehicles to realize collaborative three-dimensional dense reconstruction.
Step 3: detecting the overlapping rate of images obtained by each unmanned aerial vehicle in the map building sub-group, if the overlapping rate meets the set threshold requirement, taking a relative pose result after combined optimization with GNSS information as an initial value, taking joint observation as constraint, and further optimizing the relative pose between unmanned aerial vehicles.
The specific process is as follows: the overlap rates of the images acquired at the same moment by different unmanned aerial vehicles in the same subgroup are computed. If the overlap rate is high enough, common observations are searched in the two images with a feature point matching algorithm, and the Sim3 transformation is solved from the common observations (at least three pairs of three-dimensional points matched in the two frames), further optimizing the relative pose and relative scale between the vehicles.
The key technology in the step is to couple common view information:
the specific method for calculating the overlapping rate of the images acquired by the two unmanned aerial vehicles at the same moment is to calculate a homography matrix through matching points of the images so as to recover transformation between the two cameras, then transform the vertexes of one image onto the other image, and calculate the proportion of the area of the polygon intersection to the image.
The overlap rates of images acquired by the unmanned aerial vehicles in the same subgroup are compared; when the overlap rate meets the threshold requirement, the relative pose between the cameras is further optimized using the co-visibility information. A Sim3 problem is constructed to optimize the relative pose obtained in the previous step: the transformation and scale between the two coordinate systems are recovered from the coordinates of three (or more) pairs of matched points in the two coordinate systems. Let the scale and transformation obtained from the Sim3 problem be s_sim and T_sim, and let the pose after joint optimization with the GNSS information be T_w; an optimization problem is then constructed that refines the relative pose by combining s_sim and T_sim with T_w.
the relative pose and picture obtained using the co-view relationship will also be added as a consistency constraint for the next step to the densely reconstructed data queue.
Step 4: and (3) performing dense three-dimensional reconstruction by using the VIO pose estimation information of the single unmanned aerial vehicle and the relatively accurate relative pose and picture information among the unmanned aerial vehicles processed in the step (3) to generate a dense three-dimensional point cloud map.
Here, using dense stereo vision matching algorithm to carry out fast stereo matching on adjacent frames and frames at the same moment with overlapping rate within a certain range to obtain depth map; filtering the obtained depth map, and performing continuous image consistency check to obtain a depth map with higher precision; and after consistency check, removing redundant points in adjacent frames to generate a dense point cloud map.
The key technique in this step is dense reconstruction:
In the preceding VIO-GNSS coupled co-localization (steps 1 and 2) and the relative localization using co-visibility information (step 3), an accurate relative localization between image frames has been obtained. Dense mapping is performed on this relative localization result. The dense mapping of a single unmanned aerial vehicle consists of three parts: Denser, Cleaner and Pruner. Denser performs the reconstruction, while Cleaner and Pruner process the point cloud to remove inconsistent or redundant points. A virtual stereo pair is constructed from the estimated locally accurate poses and two consecutive frames, from which a depth map is generated, and a large-scale dense point cloud is recovered from the depth map. To improve matching efficiency, a method that maximizes the common area of the left and right views is applied. For constructing the dense disparity map, the classical fast stereo matching method ELAS is used.
In the above method, to further reduce errors in dense mapping, the consistency constraints between different frames are fully exploited: the higher the consistency, the higher the similarity between pixels and the more accurate the recovered depth map. In the consistency detection part, a producer-consumer model is constructed. Once the image queue holds a certain number of images, the recovered depth map is converted into three-dimensional points that are re-projected onto the other frames; if the residual is below a threshold, a point is considered consistent in the other views and accepted, otherwise it is removed. This step effectively removes outliers but not duplicated points. To further remove redundant points and save memory, the accepted pixels are re-projected onto the other depth maps and repeated points are removed. The resulting higher-quality dense point cloud is then passed to the point cloud fusion module.
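The consistency check described above can be sketched as follows (an illustrative, vectorised assumption about the implementation; the pose convention, thresholds and vote count are hypothetical):

```python
import numpy as np

def depth_consistency_mask(depth_ref, pose_ref, depths_others, poses_others, K,
                           thresh=0.1, min_consistent=2):
    """Keep pixels of depth_ref whose 3-D points re-project into at least
    `min_consistent` other depth maps with relative depth residual < thresh.
    Poses are 4x4 camera-to-world matrices, K is the 3x3 intrinsic matrix."""
    h, w = depth_ref.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T     # 3 x N
    pts_cam = np.linalg.inv(K) @ pix * depth_ref.reshape(1, -1)           # back-project
    pts_w = pose_ref[:3, :3] @ pts_cam + pose_ref[:3, 3:4]                # to world

    votes = np.zeros(h * w, dtype=np.int32)
    for depth_o, pose_o in zip(depths_others, poses_others):
        T = np.linalg.inv(pose_o)                                         # world -> other cam
        pts_o = T[:3, :3] @ pts_w + T[:3, 3:4]
        z = pts_o[2]
        proj = K @ pts_o
        pu = np.round(proj[0] / np.maximum(z, 1e-9)).astype(int)
        pv = np.round(proj[1] / np.maximum(z, 1e-9)).astype(int)
        valid = (z > 0) & (pu >= 0) & (pu < w) & (pv >= 0) & (pv < h)
        d_obs = np.zeros_like(z)
        d_obs[valid] = depth_o[pv[valid], pu[valid]]
        ok = valid & (d_obs > 0) & (np.abs(d_obs - z) / np.maximum(z, 1e-9) < thresh)
        votes += ok.astype(np.int32)
    return (votes >= min_consistent).reshape(h, w)
```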
Step 5: after each unmanned aerial vehicle obtains its dense point cloud map through step 4, the dense point clouds generated by the vehicles are transformed into the same coordinate system according to their relative poses; here they are converted into the coordinate system of the central unmanned aerial vehicle. For a point in the point cloud
p_k of the k-th unmanned aerial vehicle, the conversion to the central unmanned aerial vehicle coordinate system is p_0 = T_{0k}\,p_k.
The key technology in this step is point cloud fusion:
In the point cloud fusion part, the dense three-dimensional point cloud map constructed by each unmanned aerial vehicle and the real-time relative poses between the vehicles are received; to ensure the real-time performance of dense reconstruction, point cloud fusion relies on the relative positioning between the vehicles. Any point p_k of the dense point cloud reconstructed by the k-th vehicle is fused into the point cloud map built by the central vehicle as p_0 = T_{0k}\,p_k.
Because of overlapping viewing angles and similar issues, the fused dense point cloud map can exhibit a layering artifact, which greatly increases memory consumption and the computational complexity of point cloud processing. To eliminate the layering, the point cloud map is downsampled with a voxelization method, which also constrains the size of the point cloud to some extent.
The proposed method targets large-scale scenes. To allow the "inactive" part of the point cloud to be written out and further save memory, the positioning information is used to partition the point cloud in two dimensions, and the points are stored in different point cloud blocks. The index of a point cloud block is found by:
x=P x /d cell
y=P y /d cell
wherein P_x and P_y are the x and y coordinates of the point in the central coordinate system, and d_cell is the width of a point cloud block. Storing the point cloud information in this way saves memory and effectively limits the size of the point cloud in operations such as back-end point cloud voxelization.
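A minimal sketch of such block storage (the class, the block width and the active-window logic are illustrative assumptions):

```python
from collections import defaultdict
import numpy as np

class PointCloudBlocks:
    """Store fused points in 2-D blocks indexed by (P_x // d_cell, P_y // d_cell)
    so that inactive blocks can be written out and dropped from memory."""

    def __init__(self, d_cell=20.0):
        self.d_cell = d_cell
        self.blocks = defaultdict(list)

    def insert(self, point):
        """point: (x, y, z[, r, g, b]) in the central UAV coordinate system."""
        ix = int(np.floor(point[0] / self.d_cell))
        iy = int(np.floor(point[1] / self.d_cell))
        self.blocks[(ix, iy)].append(point)

    def active_blocks(self, center_xy, radius_cells=2):
        """Indices of blocks within a window around the current position."""
        cx = int(np.floor(center_xy[0] / self.d_cell))
        cy = int(np.floor(center_xy[1] / self.d_cell))
        return [(ix, iy) for (ix, iy) in self.blocks
                if abs(ix - cx) <= radius_cells and abs(iy - cy) <= radius_cells]
```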
Step 6: and (3) utilizing a hierarchical point cloud fusion registration algorithm to perform point cloud registration on the dense point cloud maps constructed by different unmanned aerial vehicles in each map building subgroup. After registering the point cloud maps in the subgroups, the relative pose of the drones in each subgroup is updated. And then carrying out fusion registration on the point cloud map constructed and fused by each sub-group, and updating the relative pose of the central unmanned aerial vehicle among different sub-groups.
The hierarchical point cloud fusion registration algorithm first determines the gravity direction of the point cloud and then partitions the point cloud into blocks along the gravity direction. For each point cloud, the points are projected onto the same plane perpendicular to the gravity direction; coarse registration is first performed using the line and plane features of the two-dimensionally compressed point clouds of the different layers, and the resulting transformations are clustered to obtain the most probable transformation. The two dense point clouds are coarsely transformed with this result, and the dense point clouds are then finely registered with an ICP algorithm combined with the three-dimensional point information of the corresponding image feature points. After the point clouds within a subgroup are fused, the relative poses between the vehicles are updated.
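The layering and two-dimensional compression described above can be sketched as follows (an assumption about one possible implementation; the number of layers, grid resolution and axis construction are illustrative, and the subsequent line/plane-feature registration is not shown):

```python
import numpy as np

def compress_to_layers(points, gravity_dir, num_layers=4, resolution=0.5):
    """Slice a point cloud (N, 3) into height layers along gravity_dir and
    project each layer onto the plane perpendicular to gravity as a binary
    2-D occupancy image (one image per layer)."""
    g = np.asarray(gravity_dir, dtype=float)
    g = g / np.linalg.norm(g)
    heights = points @ g                          # signed height of every point
    # build two in-plane axes orthogonal to gravity
    a = np.cross(g, [1.0, 0.0, 0.0])
    if np.linalg.norm(a) < 1e-6:
        a = np.cross(g, [0.0, 1.0, 0.0])
    a = a / np.linalg.norm(a)
    b = np.cross(g, a)
    uv = np.stack([points @ a, points @ b], axis=1)
    u0, v0 = uv.min(axis=0)
    edges = np.linspace(heights.min(), heights.max(), num_layers + 1)
    layers = []
    for i in range(num_layers):
        hi = edges[i + 1] if i < num_layers - 1 else np.inf
        sel = (heights >= edges[i]) & (heights < hi)
        if not np.any(sel):
            layers.append(np.zeros((1, 1), np.uint8))
            continue
        iu = ((uv[sel, 0] - u0) / resolution).astype(int)
        iv = ((uv[sel, 1] - v0) / resolution).astype(int)
        img = np.zeros((iv.max() + 1, iu.max() + 1), np.uint8)
        img[iv, iu] = 255                         # occupied cells of this layer
        layers.append(img)
    return layers
```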
The registration between the point clouds generated by the subgroups is similar to the above process, except that in order to reduce the computational effort consumption, the point clouds are downsampled, and the downsampled point clouds are registered.
The key technology in this step is point cloud registration:
After point cloud fusion, a finer point cloud registration is required. In the layered point cloud registration method provided by the invention, the point cloud is sliced into layers perpendicular to the gravity direction and each layer is then compressed onto its corresponding plane along the gravity direction. The two-dimensional point cloud images are registered with a line and surface feature method, which yields a good coarse registration result. The line features of the two-dimensional point cloud image are obtained directly with a Hough line detection algorithm; the surface features are obtained from the curvature information of the points. A finer registration is then performed with an improved ICP method, starting from this coarse registration result.
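A rough sketch of the layer compression and Hough line feature extraction described above, using OpenCV; the rasterization resolution and the detector thresholds are illustrative assumptions:

```python
import numpy as np
import cv2

def compress_layer_to_image(layer_pts: np.ndarray, res: float = 0.1) -> np.ndarray:
    """Project one gravity-aligned layer of 3-D points onto a binary 2-D image
    (occupancy grid) with cell size `res`."""
    xy = layer_pts[:, :2]
    mins = xy.min(axis=0)
    cols, rows = np.floor((xy - mins) / res).astype(int).T
    img = np.zeros((rows.max() + 1, cols.max() + 1), dtype=np.uint8)
    img[rows, cols] = 255
    return img

def line_features(img: np.ndarray):
    """Extract line features of the compressed 2-D point cloud image with a
    probabilistic Hough line detector (thresholds chosen here for illustration)."""
    return cv2.HoughLinesP(img, rho=1, theta=np.pi / 180, threshold=50,
                           minLineLength=20, maxLineGap=5)
```

Matching the detected lines between the two compressed images of the same layer gives candidate 2-D transformations, which are then clustered as described above.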
There are two ways of determining the gravity direction in the above steps. When IMU information is available, the direction measured by the static IMU can be taken as the gravity direction. Without IMU information, determining the gravity direction is harder, and it is obtained from the GNSS information instead: the GNSS information is first converted into displacements in an ENU coordinate system, and the Z axis of that coordinate system (the gravity direction) is then transformed into the camera coordinate system using the relation between the ENU coordinate system and the camera coordinate system.
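A small illustration of the GNSS-based variant, assuming the rotation R_cam_enu from the ENU frame to the camera frame has already been estimated (the name is an assumption made here):

```python
import numpy as np

def gravity_in_camera_frame(R_cam_enu: np.ndarray) -> np.ndarray:
    """Map the ENU vertical axis into the camera frame; gravity points
    opposite to the ENU up (Z) axis."""
    g_enu = np.array([0.0, 0.0, -1.0])
    return R_cam_enu @ g_enu
```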
When both IMU and GNSS information are missing or unreliable, finding the gravity direction becomes difficult. For this case, the invention provides a "pseudo gravity" direction computed from the point cloud information alone. Curvature is first computed for the two point clouds, and the normal direction of the dominant surface (the direction with the strongest surface features) is selected as the gravity direction. Once the gravity direction is obtained, the steps above are carried out as usual.
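One possible sketch of such a "pseudo gravity" estimate is given below; the neighborhood size and the choice of averaging the normals of the most planar (lowest-curvature) patches are assumptions about details the text does not spell out:

```python
import numpy as np
from scipy.spatial import cKDTree

def pseudo_gravity(points: np.ndarray, k: int = 20) -> np.ndarray:
    """Estimate a gravity-like direction as the dominant normal of the most
    planar (lowest-curvature) local neighborhoods in the point cloud."""
    tree = cKDTree(points)
    normals, curvatures = [], []
    for p in points[:: max(1, len(points) // 2000)]:        # subsample for speed
        _, idx = tree.query(p, k=k)
        nbrs = points[idx] - points[idx].mean(axis=0)
        eigval, eigvec = np.linalg.eigh(nbrs.T @ nbrs)       # ascending eigenvalues
        normals.append(eigvec[:, 0])                         # normal = smallest-eigenvalue direction
        curvatures.append(eigval[0] / eigval.sum())          # surface-variation curvature proxy
    normals, curvatures = np.asarray(normals), np.asarray(curvatures)
    flat = normals[curvatures < np.percentile(curvatures, 20)]    # keep the most planar patches
    flat = flat * np.sign(flat @ flat[0])[:, None]                # resolve normal sign ambiguity
    g = flat.mean(axis=0)
    return g / np.linalg.norm(g)
```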
In the improved ICP method, the main improvement is to give a larger weight to the three-dimensional points recovered from matched image features, since point pairs obtained by image feature matching are generally more reliable than the nearest-neighbor pairs found by traditional ICP.
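A compact sketch of one weighted point-to-point ICP step in this spirit; the weight given to feature-matched pairs (10x here) is an illustrative assumption:

```python
import numpy as np
from scipy.spatial import cKDTree

def weighted_icp_step(src, dst, feat_src, feat_dst, w_feat=10.0):
    """One ICP iteration: nearest-neighbour pairs get weight 1, pairs coming
    from image feature matches get weight w_feat. Returns (R, t) mapping src -> dst."""
    _, idx = cKDTree(dst).query(src)                  # ordinary nearest-neighbour pairs
    A = np.vstack([src, feat_src])
    B = np.vstack([dst[idx], feat_dst])
    w = np.hstack([np.ones(len(src)), np.full(len(feat_src), w_feat)])
    ca = (w[:, None] * A).sum(0) / w.sum()            # weighted centroids
    cb = (w[:, None] * B).sum(0) / w.sum()
    H = (w[:, None] * (A - ca)).T @ (B - cb)          # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                          # fix a possible reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cb - R @ ca
    return R, t
```

Applying this step repeatedly, re-establishing the nearest-neighbour correspondences after each update while keeping the feature-matched pairs fixed, yields the refined registration.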
Step 7: after the fine registration is completed, the relative pose of each unmanned aerial vehicle is updated with the registration information, and the relative pose information is used to decide whether the subgroups need to be re-divided: the relative poses are used to judge whether the acquired pictures still have a common-view relationship, so that the unmanned aerial vehicles within each subgroup keep an optimal common-view relationship. If the subgroups need to be re-divided, the method returns to step 1 and re-divides and initializes the dense reconstruction subgroups using these relative poses.
Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are illustrative and are not to be construed as limiting the invention; those skilled in the art may make changes, modifications, substitutions, and variations to the above embodiments without departing from the spirit and principles of the invention.

Claims (10)

1. A method for real-time dense reconstruction of a clustered unmanned aerial vehicle, characterized in that the method comprises the following steps:
step 1: after receiving a three-dimensional dense reconstruction task and entering a route, a plurality of unmanned aerial vehicles in the cluster unmanned aerial vehicle independently initialize VIO and autonomously position to obtain VIO pose estimation information of each unmanned aerial vehicle; the unmanned aerial vehicle is provided with a GNSS, a visual sensor and an IMU;
the method comprises the steps that the common view relation among unmanned aerial vehicles is reversely pushed according to GNSS information of each unmanned aerial vehicle, and the unmanned aerial vehicles are divided into map building subgroups according to the common view relation; each map building subgroup comprises a plurality of unmanned aerial vehicles with a common view relationship; then initializing the relative pose of the unmanned aerial vehicle in the map building sub-group according to GNSS information, and determining a central unmanned aerial vehicle by utilizing an adjacent relation;
step 2: according to the VIO pose estimation information of each unmanned aerial vehicle obtained in the step 1, global pose optimization is carried out by combining GNSS information, the scale of local positioning information and the position of global coordinates are updated in real time, and the positioning information of the recovered scale is aligned with an image time stamp; updating the relative pose among the unmanned aerial vehicles in each subgroup by using the initial coordinates of each unmanned aerial vehicle;
step 3: detecting the overlapping rate of images obtained by each unmanned aerial vehicle in the image building subgroup, if the overlapping rate meets the set threshold requirement, taking a relative pose result which is jointly optimized with GNSS information as an initial value, taking joint observation as constraint, and further optimizing the relative pose between unmanned aerial vehicles;
step 4: performing dense three-dimensional reconstruction by using VIO pose estimation information of a single unmanned aerial vehicle and relative pose and picture information among the unmanned aerial vehicles processed in the step 3, and generating a dense three-dimensional point cloud map;
step 5: after each unmanned aerial vehicle obtains a dense point cloud map through the step 4, according to the relative pose of each unmanned aerial vehicle, transforming the dense point cloud generated by each unmanned aerial vehicle into the same coordinate system to realize point cloud fusion;
step 6: carrying out point cloud registration on dense point cloud maps built by different unmanned aerial vehicles in each map building subgroup by using a layered point cloud fusion registration algorithm; after registering the point cloud maps in the subgroups, updating the relative pose of the unmanned aerial vehicle in each subgroup; then, carrying out fusion registration on the point cloud map constructed and fused by each sub-group, and updating the relative pose of the central unmanned aerial vehicle among different sub-groups;
step 7: after the accurate registration is completed, updating the relative pose of each unmanned aerial vehicle by using registration information, and judging whether the sub-group needs to be divided again by using the relative pose information; if the sub-group needs to be re-divided, return to step 1 and re-divide and initialize the dense reconstruction sub-group with these relative poses.
2. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 1, wherein: the process of initializing the VIO and performing autonomous positioning in the step 1 is as follows: firstly, aligning an initial coordinate system of the VIO with gravity by using IMU information on the unmanned aerial vehicle, and solving the posture of the unmanned aerial vehicle by using the transformation between the initial static IMU information and a gravity vector; then, forming multi-frame image information into a sliding window for local batch optimization, making a SfM with pure vision in the sliding window, recovering a camera track without scale, and obtaining an IMU track by using IMU information integration; finally, the camera track and the IMU track are aligned.
3. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 1, wherein: the specific process of the step 2 is as follows: firstly, acquiring VIO local pose and GNSS information of an unmanned aerial vehicle, and aligning time stamps of the VIO local pose and the GNSS information; then, constructing a global optimization problem by utilizing the VIO local pose and GNSS information, so that the residual error between the VIO local pose and the GNSS information constructed by the VIO local pose and the GNSS information is minimum, wherein variables to be optimized comprise the global pose of the unmanned aerial vehicle and the global scale of the VIO estimated pose; then solving an optimization problem and obtaining transformation from a local coordinate system to a global coordinate system through calculation; and converting the VIO estimation pose after the scale recovery into the camera pose by using the external parameter matrix, and performing time stamp alignment on the camera pose and the received picture.
4. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 1, wherein: the specific method for calculating the overlapping rate of the images acquired by two unmanned aerial vehicles at the same moment in step 3 is to calculate a homography matrix from the matched points of the images so as to recover the transformation between the two cameras, then transform the vertices of one image into the other image, and calculate the proportion of the area of the polygon intersection to the image area.
5. A method for dense reconstruction of a clustered unmanned aerial vehicle in real time according to claim 1 or 4, wherein: in step 3, comparing the overlapping rate of the images acquired by the unmanned aerial vehicles in the same subgroup, and if the overlapping rate meets the set threshold requirement, further optimizing the relative pose between the cameras by utilizing the common view information: constructing a Sim3 problem, and optimizing the relative pose obtained in the step 2; and the transformation and the scale between the two coordinate systems are recovered by utilizing the coordinates of the three pairs of matched points under the two coordinate systems.
6. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 1, wherein: in the step 4, a dense stereoscopic vision matching algorithm is used for fast stereoscopic matching of adjacent frames and frames with the overlapping rate within a set range at the same moment, and a depth map is obtained; filtering the obtained depth map, and performing continuous image consistency check to obtain a depth map with higher precision; and after consistency check, removing redundant points in adjacent frames to generate a dense point cloud map.
7. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 1, wherein: in step 5, a dense three-dimensional point cloud map constructed by each unmanned aerial vehicle and real-time relative pose among unmanned aerial vehicles are respectively received, and point cloud fusion is carried out depending on relative positioning among the unmanned aerial vehicles:
for any point P_i^k of the dense point cloud reconstructed by the kth unmanned aerial vehicle, the point is fused into the point cloud map built by the central unmanned aerial vehicle by transforming it with the relative pose of the kth unmanned aerial vehicle with respect to the central unmanned aerial vehicle; and the fused dense point cloud map is downsampled by a voxelization method.
8. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 7, wherein: in step 5, two-dimensional point cloud position segmentation is carried out by utilizing positioning information, and point clouds are respectively stored in different point cloud blocks; the index of the point cloud block is found by means of the following equation:
x = P_x / d_cell
y = P_y / d_cell
wherein P_x, P_y are the coordinates of the point projected onto the x-y plane of the central coordinate system; d_cell is the width of a point cloud block.
9. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 1, wherein: in step 6, the hierarchical point cloud fusion registration algorithm is as follows: firstly, determining the gravity direction of the point cloud, and then partitioning the point cloud in the gravity direction of the point cloud; for each point cloud, each point is projected onto the same plane vertical to the gravity direction, firstly, rough registration is carried out by utilizing line-plane characteristics of two-dimensional compression point clouds of different layers, and clustering is carried out on the registered transformation results to obtain the most probable transformation results; and performing coarse transformation on the two dense point clouds by using the result, and then further performing fine registration on the dense point clouds by using an ICP algorithm in combination with three-dimensional point information of corresponding feature points of the image.
10. The method for real-time dense reconstruction of a clustered unmanned aerial vehicle of claim 9, wherein: in step 6, when the IMU and GNSS information are missing or the information obtained by the IMU and GNSS is unreliable, the gravity direction calculation is performed by means of the point cloud information: first, curvature calculation is performed on two groups of point clouds, and the normal direction of the direction with the largest surface characteristics is selected as the gravity direction.
CN202310188327.5A 2023-03-02 2023-03-02 Real-time dense reconstruction method for clustered unmanned aerial vehicle Pending CN116168171A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310188327.5A CN116168171A (en) 2023-03-02 2023-03-02 Real-time dense reconstruction method for clustered unmanned aerial vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310188327.5A CN116168171A (en) 2023-03-02 2023-03-02 Real-time dense reconstruction method for clustered unmanned aerial vehicle

Publications (1)

Publication Number Publication Date
CN116168171A true CN116168171A (en) 2023-05-26

Family

ID=86421796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310188327.5A Pending CN116168171A (en) 2023-03-02 2023-03-02 Real-time dense reconstruction method for clustered unmanned aerial vehicle

Country Status (1)

Country Link
CN (1) CN116168171A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118135160A (en) * 2024-05-08 2024-06-04 西北工业大学 Non-identification large-scale cable laying operation augmented reality guiding method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination