CN112509061B

CN112509061B - Multi-camera visual positioning method, system, electronic device and medium

Info

Publication number: CN112509061B
Application number: CN202011464163.7A
Authority: CN
Inventors: 张雁鹏
Original assignee: Shandong Inspur Science Research Institute Co Ltd
Current assignee: Shandong Inspur Science Research Institute Co Ltd
Priority date: 2020-12-14
Filing date: 2020-12-14
Publication date: 2024-03-22
Anticipated expiration: 2040-12-14
Also published as: CN112509061A

Abstract

The invention discloses a multi-camera visual positioning method, an electronic device and a medium, belongs to the technical field of visual positioning, and addresses the technical problem of how to process multiple cameras in parallel so that visual positioning can still be performed when part of the field of view is occluded. In the method, feature point identification, tracking and pose calculation are carried out simultaneously for monocular and binocular data based on the ORB_SLAM2 algorithm, and the related data are stored in corresponding data linked lists for monocular and binocular data respectively. The electronic device comprises at least one memory and at least one processor; the at least one memory stores a machine readable program, and the at least one processor invokes the machine readable program to perform the above method. The computer readable medium has stored thereon computer instructions which, when executed by a processor, cause the processor to perform the above method.

Description

Multi-camera visual positioning method, system, electronic device and medium

Technical Field

The invention relates to the technical field of visual positioning, in particular to a multi-camera visual positioning method, an electronic device and a computer readable medium.

Background

Visual SLAM technology, represented by ORB_SLAM, is maturing steadily. Because cameras have inherent advantages over lidar, millimeter-wave radar and similar sensors in cost and in the maturity of the underlying technology, visual SLAM is widely applied in automatic driving and robot positioning.

Currently popular visual SLAM systems process data from only one of three sensor types: a monocular camera, a binocular camera, or a depth camera. As shown in FIG. 1, in actual use, when the sensor is occluded, or when the field of view contains a large area of uniform color with no distinct features, positioning can easily fail and the system cannot correctly recover its position.

Based on the above, how to process multiple cameras in parallel so that visual positioning can still be performed when part of the field of view is occluded is the technical problem to be solved.

Disclosure of Invention

The technical task of the invention is to provide a multi-camera visual positioning method, an electronic device and a computer readable medium that solve the problem of how to process multiple cameras in parallel so that visual positioning can still be performed when part of the field of view is occluded.

In a first aspect, the present invention provides a multi-camera visual positioning method which, based on the ORB_SLAM2 algorithm, performs feature point recognition, tracking and pose calculation simultaneously for monocular and binocular data, and stores the related data in corresponding data linked lists for monocular and binocular data respectively, the method comprising the steps of:

for a monocular image and a binocular image, extracting key frame data based on an ORB feature point recognition algorithm and a tracking algorithm, and acquiring key point data through the relation pointers between key frames and key points;

storing the key frame data and the key point data in corresponding data linked lists for monocular and binocular data respectively;

based on the positional relation between the camera and preset points in the coordinate map, performing coordinate conversion on the key frame data for monocular and binocular data respectively to obtain corresponding position data, and storing the position data in a corresponding data linked list;

performing a reverse lookup on the coordinate-converted key frame data to obtain the key point data;

and merging the point cloud data for monocular and binocular data respectively, and storing the merged point cloud data in a corresponding data linked list.
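The reverse lookup and merging steps can be illustrated with a minimal sketch. The source does not specify the data layout or the merging rule, so the names `KeyPoint`, `KeyFrame`, `reverse_lookup` and `merge_by_modality` are hypothetical, and merging is assumed here to mean keeping one entry per key point identifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class KeyPoint:
    point_id: int
    position: Point3D  # assumed to be expressed in the map frame


@dataclass
class KeyFrame:
    frame_id: int
    modality: str  # "mono" or "stereo" (hypothetical labels)
    # References to observed key points, standing in for the relation pointers.
    keypoints: List[KeyPoint] = field(default_factory=list)


def reverse_lookup(frames: List[KeyFrame]) -> List[KeyPoint]:
    """Follow each key frame's references and collect its key points."""
    return [kp for frame in frames for kp in frame.keypoints]


def merge_point_cloud(keypoints: List[KeyPoint]) -> Dict[int, Point3D]:
    """Merge key points into one cloud, keeping one entry per point id."""
    cloud: Dict[int, Point3D] = {}
    for kp in keypoints:
        cloud.setdefault(kp.point_id, kp.position)
    return cloud


def merge_by_modality(frames: List[KeyFrame]) -> Dict[str, Dict[int, Point3D]]:
    """Build separate merged clouds for monocular and binocular key frames."""
    clouds: Dict[str, Dict[int, Point3D]] = {}
    for modality in ("mono", "stereo"):
        selected = [f for f in frames if f.modality == modality]
        clouds[modality] = merge_point_cloud(reverse_lookup(selected))
    return clouds
```

Because the two clouds are built from disjoint sets of key frames, a key point observed only by a monocular camera never enters the binocular cloud.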

Preferably, before the position data is stored in the corresponding data linked list, the position data is filtered to obtain stable position data, and the stable position data is stored in the corresponding data linked list.

Preferably, the position data is filtered by a Kalman filter.
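The filtering step can be sketched as follows. The source names only a Kalman filter and gives neither a state model nor noise parameters, so this sketch assumes an independent scalar filter per coordinate axis with a constant-position model; the class `ScalarKalmanFilter`, the function `filter_positions` and all variance values are illustrative assumptions.

```python
class ScalarKalmanFilter:
    """One-dimensional Kalman filter with a constant-position model."""

    def __init__(self, initial_value, initial_variance=1.0,
                 process_variance=1e-3, measurement_variance=0.25):
        self.x = initial_value          # state estimate
        self.p = initial_variance       # estimate variance
        self.q = process_variance       # assumed process noise
        self.r = measurement_variance   # assumed measurement noise

    def update(self, measurement):
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        self.p += self.q
        # Correct: blend prediction and measurement using the Kalman gain.
        gain = self.p / (self.p + self.r)
        self.x += gain * (measurement - self.x)
        self.p *= 1.0 - gain
        return self.x


def filter_positions(positions):
    """Filter a sequence of (x, y, z) tuples, one filter per axis."""
    if not positions:
        return []
    filters = [ScalarKalmanFilter(value) for value in positions[0]]
    return [tuple(f.update(v) for f, v in zip(filters, pos))
            for pos in positions]
```

Under these assumptions the filtered values, rather than the raw ones, would be the stable position data appended to the data linked list.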

Preferably, storing the key frame data and the key point data in corresponding data linked lists for monocular and binocular data respectively comprises:

for each monocular image, storing the monocular key frame data and the monocular key point data in a corresponding monocular key frame data linked list and monocular key point data linked list respectively, and for each binocular image, storing the binocular key frame data and the binocular key point data in a corresponding binocular key frame data linked list and binocular key point data linked list respectively.
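The separate storage described above can be sketched as follows. The source does not give the list implementation, so `collections.deque` stands in for a linked list, and the names `ModalityStore` and `CameraDataStore` and the labels `"mono"` and `"stereo"` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ModalityStore:
    """Key frame list and key point list for one modality."""
    keyframes: deque = field(default_factory=deque)
    keypoints: deque = field(default_factory=deque)


class CameraDataStore:
    """Keeps monocular and binocular data in separate lists."""

    def __init__(self):
        self._stores = {"mono": ModalityStore(), "stereo": ModalityStore()}

    def add_keyframe(self, modality, keyframe, keypoints):
        # An unknown modality raises KeyError instead of being stored silently.
        store = self._stores[modality]
        store.keyframes.append(keyframe)
        store.keypoints.extend(keypoints)

    def keyframes(self, modality):
        return list(self._stores[modality].keyframes)

    def keypoints(self, modality):
        return list(self._stores[modality].keypoints)
```

A call such as `store.add_keyframe("mono", frame, points)` touches only the monocular lists, so later processing of binocular data never reads monocular entries.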

In a second aspect, the present invention provides a multi-camera visual positioning system for performing visual positioning by the multi-camera visual positioning method according to any implementation of the first aspect, the system comprising:

the key point extraction module is used for extracting, for a monocular image and a binocular image, key frame data based on an ORB feature point recognition algorithm and a tracking algorithm, and acquiring key point data through the relation pointers between key frames and key points;

the data linked list module is used for storing the key frame data and the key point data in corresponding data linked lists for monocular and binocular data respectively;

the coordinate conversion module is used for performing coordinate conversion on the key frame data for monocular and binocular data respectively, based on the positional relation between the camera and preset points in the coordinate map, to obtain corresponding position data, and for storing the position data in a corresponding data linked list;

the reverse checking module is used for performing a reverse lookup on the coordinate-converted key frame data to obtain the key point data;

and the merging module is used for merging the point cloud data for monocular and binocular data respectively and storing the merged point cloud data in a corresponding data linked list.

Preferably, the coordinate conversion module is configured to filter the position data to obtain stable position data before storing the position data in the corresponding data linked list, and to store the stable position data in the corresponding data linked list.

Preferably, the coordinate conversion module is configured to filter the position data by a Kalman filter.

In a third aspect, the present invention provides an electronic device, comprising: at least one memory and at least one processor;

the at least one memory for storing a machine readable program;

the at least one processor is configured to invoke the machine-readable program to perform the method of any implementation of the first aspect.

In a fourth aspect, the present invention provides a computer readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to perform the method of any implementation of the first aspect.

The multi-camera visual positioning method, the electronic device and the computer readable medium have the following advantages:

1. based on the ORB_SLAM2 algorithm, feature point recognition, tracking and pose calculation are carried out simultaneously for monocular and binocular data, and the related data are stored in corresponding data linked lists for monocular and binocular data respectively; this expands the effective area of available data, so that in a multi-camera scene positioning failure caused by occlusion or by large featureless areas in the field of view can be effectively avoided, normal operation continues under partial occlusion, and adaptability to indoor and outdoor environments is greatly improved;

2. multiple channels of monocular and binocular data are processed at the same time, so that pose data can be obtained more accurately;

3. the distance information of monocular data is estimated from camera parameters and therefore has a larger error than the distance obtained from binocular data by binocular parallax calculation; storing monocular and binocular data separately prevents the monocular data from degrading the more accurate binocular data.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

The invention is further described below with reference to the accompanying drawings.

FIG. 1 is a schematic view of the effective area of an ORB_SLAM binocular camera;

FIG. 2 is a flow chart of a multi-camera visual positioning method of embodiment 1;

FIG. 3 is a schematic view of an effective area of a binocular camera in the multi-camera visual positioning method of embodiment 1;

FIG. 4 is a schematic diagram of six camera data processing modes in the multi-camera visual positioning method in embodiment 1.

Detailed Description

The invention will be further described with reference to the accompanying drawings and specific examples, so that those skilled in the art can better understand the invention and implement it, but the examples are not meant to limit the invention, and the technical features of the embodiments of the invention and the examples can be combined with each other without conflict.

It should be understood that "plurality" in the embodiments of the present invention means two or more.

The embodiments of the invention provide a multi-camera visual positioning method, an electronic device and a computer readable medium, which solve the technical problem of how to process multiple cameras in parallel so that visual positioning can still be performed when part of the field of view is occluded.

Example 1:

The invention discloses a multi-camera visual positioning method which, based on the ORB_SLAM2 algorithm, performs feature point identification, tracking and pose calculation for monocular and binocular data, and stores the related data in corresponding data linked lists for monocular and binocular data respectively.

As shown in FIG. 2, the method comprises the steps of:

S100, for a monocular image and a binocular image, extracting key frame data based on an ORB feature point recognition algorithm and a tracking algorithm, and acquiring key point data through the relation pointers between key frames and key points;

S200, storing the key frame data and the key point data in corresponding data linked lists for monocular and binocular data respectively;

S300, based on the positional relation between the camera and preset points in the coordinate map, performing coordinate conversion on the key frame data for monocular and binocular data respectively to obtain corresponding position data, and storing the position data in a corresponding data linked list;

S400, performing a reverse lookup on the coordinate-converted key frame data to obtain the key point data;

S500, merging the point cloud data for monocular and binocular data respectively, and storing the merged point cloud data in a corresponding data linked list.
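The coordinate conversion in the third step can be illustrated with a minimal sketch. The source does not state the form of the transformation, so this sketch assumes a rigid transform in which the SLAM frame has already been aligned with a vertical z axis and differs from the map frame only by a yaw rotation and a translation; `slam_to_map`, `origin_in_map` and `yaw_rad` are hypothetical names.

```python
import math


def slam_to_map(position, origin_in_map, yaw_rad):
    """Convert a key frame position from the SLAM frame to the map frame.

    position:      (x, y, z) of the key frame in the SLAM frame.
    origin_in_map: (x, y, z) of the SLAM origin in the map frame, assumed
                   to be known from the preset point in the coordinate map.
    yaw_rad:       assumed rotation of the SLAM frame about the vertical axis.
    """
    x, y, z = position
    ox, oy, oz = origin_in_map
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # Rotate about the z axis, then translate to the map origin offset.
    return (ox + c * x - s * y, oy + s * x + c * y, oz + z)
```

For example, with the SLAM origin at (10, 20, 0) in the map and a yaw of 90 degrees, a key frame one unit ahead along the SLAM x axis lands one unit along the map y axis from that origin.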

In this embodiment, storing the key frame data and the key point data in corresponding data linked lists for monocular and binocular data respectively means: for each monocular image, the monocular key frame data and the monocular key point data are stored in a corresponding monocular key frame data linked list and monocular key point data linked list respectively, and for each binocular image, the binocular key frame data and the binocular key point data are stored in a corresponding binocular key frame data linked list and binocular key point data linked list respectively.

The distance information of monocular data is estimated from camera parameters and therefore has a larger error than the distance obtained from binocular data by binocular parallax calculation; storing the two kinds of data separately prevents the monocular data from degrading the more accurate binocular data.
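For reference, the binocular parallax calculation can be sketched with the standard pinhole stereo relation Z = f * B / d, where f is the focal length in pixels, B the baseline and d the disparity. The source does not give this formula or any camera parameters, so the function names and the numbers in the usage note are illustrative assumptions.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth from disparity for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth uncertainty: dZ = Z**2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px
```

With an assumed focal length of 700 pixels and a baseline of 0.12 m, a disparity of 42 pixels corresponds to a depth of about 2 m. The error term grows with the square of the depth, which bounds binocular accuracy at long range; a monocular estimate has no such measured baseline to anchor its scale.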

As shown in FIG. 3 and FIG. 4, this embodiment processes monocular and binocular data simultaneously, which expands the effective area of available data, so that positioning failure caused by occlusion or by a large featureless area in the field of view can be effectively avoided in a multi-camera scene.
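The effect of simultaneous processing under partial occlusion can be sketched as follows. The source does not describe the scheduling mechanism or the criterion for lost tracking, so this sketch assumes one worker thread per camera and a hypothetical feature-count threshold `MIN_FEATURES`; `track_camera` is a placeholder for the real per-camera ORB tracking, not part of ORB_SLAM2.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_FEATURES = 50  # hypothetical threshold below which tracking counts as lost


def track_camera(camera):
    """Placeholder for per-camera tracking.

    camera is a tuple (name, modality, visible_feature_count). Returns the
    name, the modality, and whether enough features are visible to track.
    """
    name, modality, visible_features = camera
    return name, modality, visible_features >= MIN_FEATURES


def process_cameras(cameras):
    """Track all cameras in parallel and group those still tracking."""
    tracked = {"mono": [], "stereo": []}
    if not cameras:
        return tracked
    with ThreadPoolExecutor(max_workers=len(cameras)) as pool:
        # pool.map returns results in the same order as the input cameras.
        for name, modality, ok in pool.map(track_camera, cameras):
            if ok:
                tracked[modality].append(name)
    return tracked


def can_localize(tracked):
    """Positioning can continue while at least one camera is tracking."""
    return any(tracked.values())
```

In this sketch an occluded camera simply contributes nothing, and positioning continues as long as any other monocular or binocular stream still sees enough features.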

As an improvement of this embodiment, before the position data is stored in the corresponding data linked list, the position data is filtered to obtain stable position data, and the stable position data is stored in the corresponding data linked list. In this embodiment, the position data is filtered by a Kalman filter.

Example 2:

The multi-camera visual positioning system of this embodiment performs visual positioning through the multi-camera visual positioning method disclosed in embodiment 1.

The system comprises a key point extraction module, a storage module, a coordinate conversion module, a reverse checking module and a merging module. The key point extraction module is used for extracting, for a monocular image and a binocular image, key frame data based on an ORB feature point recognition algorithm and a tracking algorithm, and acquiring key point data through the relation pointers between key frames and key points; the storage module is used for storing the key frame data and the key point data in corresponding data linked lists for monocular and binocular data respectively; the coordinate conversion module is used for performing coordinate conversion on the key frame data for monocular and binocular data respectively, based on the positional relation between the camera and preset points in the coordinate map, to obtain corresponding position data, and for storing the position data in a corresponding data linked list; the reverse checking module is used for performing a reverse lookup on the coordinate-converted key frame data to obtain the key point data; the merging module is used for merging the point cloud data for monocular and binocular data respectively and storing the merged point cloud data in a corresponding data linked list.

Here, storing the key frame data and the key point data in corresponding data linked lists for monocular and binocular data respectively means: for each monocular image, the monocular key frame data and the monocular key point data are stored in a corresponding monocular key frame data linked list and monocular key point data linked list respectively, and for each binocular image, the binocular key frame data and the binocular key point data are stored in a corresponding binocular key frame data linked list and binocular key point data linked list respectively.

Before the position data is stored in the corresponding data linked list, the position data is filtered to obtain stable position data, and the stable position data is stored in the corresponding data linked list. In the system, the position data is filtered by a Kalman filter.

Because the system processes monocular and binocular data simultaneously, the effective area of available data is expanded, and positioning failure caused by occlusion or by a large featureless area in the field of view can be effectively avoided in a multi-camera scene.

Example 3:

The invention provides an electronic device, comprising: at least one memory and at least one processor; the at least one memory is used for storing a machine readable program; the at least one processor is configured to invoke the machine-readable program to perform the method disclosed in embodiment 1.

Example 4:

A computer readable medium of the present invention has stored thereon computer instructions which, when executed by a processor, cause the processor to perform the method disclosed in embodiment 1 of the present invention. Specifically, a system or apparatus may be provided with a storage medium on which software program code realizing the functions of any of the above embodiments is stored, and a computer (or CPU or MPU) of the system or apparatus may be caused to read out and execute the program code stored in the storage medium.

In this case, the program code itself read from the storage medium may realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code form part of the present invention.

Examples of the storage medium for providing the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer over a communication network.

Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.

Further, it is understood that the program code read out from the storage medium may be written into a memory provided in an expansion board inserted into a computer or into a memory provided in an expansion unit connected to the computer, and then a CPU or the like mounted on the expansion board or the expansion unit may be caused to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above embodiments.

It should be noted that not all the steps and modules in the above flowcharts and the system configuration diagrams are necessary, and some steps or modules may be omitted according to actual needs. The execution sequence of the steps is not fixed and can be adjusted as required. The system structure described in the above embodiments may be a physical structure or a logical structure, that is, some modules may be implemented by the same physical entity, or some modules may be implemented by multiple physical entities, or may be implemented jointly by some components in multiple independent devices.

In the above embodiments, the hardware unit may be mechanically or electrically implemented. For example, a hardware unit may include permanently dedicated circuitry or logic (e.g., a dedicated processor, FPGA, or ASIC) to perform the corresponding operations. The hardware unit may also include programmable logic or circuitry (e.g., a general-purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The particular implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.

While the invention has been illustrated and described in detail in the drawings and in the preferred embodiments, the invention is not limited to the disclosed embodiments, and it will be appreciated by those skilled in the art that the technical means of the various embodiments described above may be combined to produce further embodiments of the invention, which are also within the scope of the invention.

Claims

1. A multi-camera visual positioning method, characterized in that, based on the ORB_SLAM2 algorithm, feature point identification, tracking and pose calculation are carried out simultaneously for monocular and binocular data, and the related data are stored in corresponding data linked lists for monocular and binocular data respectively, the method comprising the following steps:

for each monocular image, storing the monocular key frame data and the monocular key point data in a corresponding monocular key frame data linked list and a corresponding monocular key point data linked list respectively, and for each binocular image, storing the binocular key frame data and the binocular key point data in a corresponding binocular key frame data linked list and a corresponding binocular key point data linked list respectively;

based on the positional relation between the camera and preset points in the coordinate map, performing coordinate conversion on the key frame data for monocular and binocular data respectively to obtain corresponding position data, filtering the position data through a Kalman filter to obtain stable position data, and storing the stable position data in a corresponding data linked list;

2. A multi-camera visual positioning system for visual positioning by a multi-camera visual positioning method as claimed in claim 1, the system comprising:

the data linked list module is used for storing, for each monocular image, the monocular key frame data and the monocular key point data in a corresponding monocular key frame data linked list and a corresponding monocular key point data linked list respectively, and for storing, for each binocular image, the binocular key frame data and the binocular key point data in a corresponding binocular key frame data linked list and a corresponding binocular key point data linked list respectively;

the coordinate conversion module is used for performing coordinate conversion on the key frame data for monocular and binocular data respectively, based on the positional relation between the camera and preset points in the coordinate map, to obtain corresponding position data, filtering the position data through a Kalman filter to obtain stable position data, and storing the stable position data in a corresponding data linked list;

3. An electronic device, comprising: at least one memory and at least one processor;

the at least one memory for storing a machine readable program;

the at least one processor configured to invoke the machine readable program to perform the method of claim 1.

4. A computer readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to perform the method of claim 1.