Multi-camera visual positioning method, system, electronic device and medium
Technical Field
The invention relates to the technical field of visual positioning, and in particular to a multi-camera visual positioning method, a multi-camera visual positioning system, an electronic device, and a computer-readable medium.
Background
Current visual SLAM technology, represented by ORB_SLAM, is maturing. Because cameras have natural advantages over lidar, millimeter-wave radar, and similar sensors in terms of cost, technical maturity, and the like, visual SLAM is widely applied in autonomous driving and robot positioning.
Currently popular visual SLAM systems process input from only one of three sensor types: a monocular camera, a binocular (stereo) camera, or a depth camera. As shown in fig. 1, in actual use, when the sensor is occluded, or when a large region of uniform color with no distinct feature distribution appears in the field of view, positioning easily fails and the pose cannot be returned correctly.
In view of the above, how to process multiple cameras in parallel, so that visual positioning can still be performed when a partial region is occluded, is a technical problem to be solved.
Disclosure of Invention
The technical task of the present invention is to provide a multi-camera visual positioning method, an electronic device, and a computer-readable medium that solve the problem of processing multiple cameras in parallel while still performing visual positioning when a partial region is occluded.
In a first aspect, the present invention provides a multi-camera visual positioning method that, based on the ORB_SLAM2 algorithm, performs feature point identification, tracking, and pose calculation on monocular and binocular images, and stores the related data into corresponding data linked lists for the monocular and binocular modes respectively. The method includes the steps of:
for the monocular images and the binocular images, extracting key frame data based on an ORB feature point identification algorithm and a tracking algorithm, and acquiring key point data through the relationship pointers between key frames and key points;
respectively storing the key frame data and the key point data into the corresponding data linked lists according to the monocular and binocular modes;
based on the positional relationship between the camera and a preset point in a coordinate map, performing coordinate conversion on the key frame data for the monocular and binocular modes respectively to obtain corresponding position data, and storing the position data into the corresponding data linked list;
performing a reverse check on the coordinate-converted key frame data to obtain key point data;
and merging the point cloud data for the monocular and binocular modes respectively, and storing the merged point cloud data into the corresponding data linked list.
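The coordinate conversion step above can be illustrated with a short sketch. This is not code from the patent: the transform values and function name are assumed for illustration; the idea is only that a key-frame pose in the camera frame is mapped into the map frame through a fixed homogeneous transform derived from the known relationship between the camera and a preset map point.

```python
import numpy as np

# Assumed fixed transform from the camera frame to the map frame, derived
# from the known position of the camera relative to a preset map point.
T_map_cam = np.eye(4)
T_map_cam[:3, 3] = [2.0, -1.0, 0.0]    # translation: camera origin in map coordinates

def keyframe_to_map(pose_cam: np.ndarray) -> np.ndarray:
    """Convert a 4x4 key-frame pose from the camera frame to the map frame."""
    return T_map_cam @ pose_cam

pose_cam = np.eye(4)
pose_cam[:3, 3] = [1.0, 0.0, 0.0]      # key frame 1 m ahead of the camera origin
pose_map = keyframe_to_map(pose_cam)
assert np.allclose(pose_map[:3, 3], [3.0, -1.0, 0.0])
```

The position data stored in the data linked list would then be the translation component of the converted pose.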
Preferably, before the position data is stored in the corresponding data linked list, the position data is filtered to obtain stable position data, and the stable position data is stored in the corresponding data linked list.
Preferably, the position data is filtered by a Kalman filter.
Preferably, storing the key frame data and the key point data into the corresponding data linked lists according to the monocular and binocular modes respectively is performed as follows:
for each monocular image, storing the monocular key frame data and the monocular key point data into a corresponding monocular key frame data linked list and a corresponding monocular key point data linked list; and for each binocular image, storing the binocular key frame data and the binocular key point data into a corresponding binocular key frame data linked list and a corresponding binocular key point data linked list.
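The per-mode data linked lists can be pictured with a minimal sketch. The record types and names below (`KeyFrame`, `ModeLists`, `store`) are hypothetical, not from the patent; the point is only that each camera mode keeps its own key frame list and key point list, so monocular and binocular data never mix.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class KeyFrame:
    frame_id: int
    pose: tuple                                       # camera pose, e.g. (x, y, z)
    keypoint_ids: list = field(default_factory=list)  # "relationship pointers" to key points

@dataclass
class ModeLists:
    """One pair of data linked lists per camera mode (monocular or binocular)."""
    keyframes: deque = field(default_factory=deque)
    keypoints: deque = field(default_factory=deque)

lists = {"mono": ModeLists(), "stereo": ModeLists()}

def store(mode: str, keyframe: KeyFrame, keypoints: list) -> None:
    lists[mode].keyframes.append(keyframe)
    lists[mode].keypoints.extend(keypoints)

store("stereo", KeyFrame(0, (0.0, 0.0, 0.0), [0, 1]), ["pt0", "pt1"])
store("mono", KeyFrame(0, (0.1, 0.0, 0.0), [2]), ["pt2"])
assert len(lists["stereo"].keyframes) == 1 and len(lists["mono"].keypoints) == 1
```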
In a second aspect, the present invention provides a multi-camera visual positioning system, configured to perform visual positioning by using the multi-camera visual positioning method according to any one of the first aspects, where the system includes:
the key point extraction module is used for extracting key frame data from the monocular images and the binocular images based on an ORB feature point identification algorithm and a tracking algorithm, and for obtaining key point data through the relationship pointers between key frames and key points;
the data linked list module is used for respectively storing the key frame data and the key point data into the corresponding data linked lists according to the monocular and binocular modes;
the coordinate conversion module is used for performing coordinate conversion on the key frame data for the monocular and binocular modes respectively, based on the positional relationship between the camera and a preset point in a coordinate map, to obtain corresponding position data, and for storing the position data into the corresponding data linked list;
the reverse check module is used for performing a reverse check on the coordinate-converted key frame data to obtain key point data;
and the merging module is used for merging the point cloud data for the monocular and binocular modes respectively and storing the merged point cloud data into the corresponding data linked list.
Preferably, the coordinate conversion module is configured to filter the position data to obtain stable position data before storing the position data in the corresponding data linked list, and store the stable position data in the corresponding data linked list.
Preferably, the coordinate conversion module is configured to filter the position data through a Kalman filter.
In a third aspect, the present invention provides an electronic device, comprising: at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program to perform the method of any of the first aspects.
In a fourth aspect, the present invention provides a computer-readable medium having computer instructions stored thereon which, when executed by a processor, cause the processor to perform the method of any one of the first aspects.
The multi-camera visual positioning method, the electronic device and the computer readable medium have the following advantages:
1. Based on the ORB_SLAM2 algorithm, feature point identification, tracking, and pose calculation are performed on monocular and binocular images simultaneously, and the related data are stored into corresponding data linked lists for the monocular and binocular modes respectively. This expands the effective area of available data, so that in a multi-camera scene, positioning failure caused by occlusion or by a large featureless area in the field of view is effectively avoided; the system can still work normally under partial occlusion, and indoor and outdoor environmental adaptability is greatly improved.
2. Processing multiple channels of monocular and binocular data simultaneously yields more accurate pose data.
3. The distance information of monocular data is estimated from camera parameters and therefore carries a larger error than the distance obtained from binocular disparity calculation. Storing monocular and binocular data separately prevents the monocular data from degrading the more accurate binocular data.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of the effective area of an ORB_SLAM binocular camera;
FIG. 2 is a block flow diagram of a multi-camera visual positioning method according to embodiment 1;
FIG. 3 is a schematic diagram of the effective area of the binocular cameras in the multi-camera visual positioning method of embodiment 1;
FIG. 4 is a schematic diagram of the data processing of six cameras in the multi-camera visual positioning method of embodiment 1.
Detailed Description
The present invention is further described below with reference to the drawings and specific embodiments so that those skilled in the art can better understand and implement it. The embodiments are not to be construed as limiting the invention, and the embodiments and their technical features may be combined with one another where no conflict arises.
It is to be understood that "a plurality" in the embodiments of the present invention means two or more.
The embodiments of the invention provide a multi-camera visual positioning method, an electronic device, and a computer-readable medium, which solve the technical problem of processing multiple cameras in parallel while still performing visual positioning when a partial region is occluded.
Example 1:
The multi-camera visual positioning method of the present invention is based on the ORB_SLAM2 algorithm: feature point identification, tracking, and pose calculation are performed on monocular and binocular images simultaneously, and the related data are stored into corresponding data linked lists for the monocular and binocular modes respectively.
As shown in fig. 2, the method comprises the steps of:
s100, extracting key frame data based on an ORB characteristic point recognition algorithm and a tracking algorithm for the monocular image and the binocular image, and acquiring key point data through a relationship pointer of the key frame and the key point;
s200, respectively storing the key frame data and the key point data into corresponding data linked lists according to the monocular and binocular modes;
s300, respectively carrying out coordinate conversion on key frame data according to a monocular and a binocular on the basis of the position relation between the camera and a preset point in a coordinate map to obtain corresponding position data, and storing the position data into a corresponding data linked list;
s400, performing reverse check on the key frame data after the coordinate transformation to obtain key point data;
and S500, respectively merging the point cloud data according to the monocular and binocular images, and storing the merged point cloud data into a corresponding data linked list.
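Steps S100 to S500 can be sketched as a processing skeleton. The patent gives no code, so every function and field name below (`extract_keyframes`, `to_map_frame`, `process`, the stub data) is hypothetical; the sketch only illustrates how each step feeds the per-mode data linked lists.

```python
import numpy as np

def extract_keyframes(image):
    """S100 (hypothetical stub): ORB feature identification and tracking
    would yield key frames, each pointing at its key points."""
    return [{"pose_cam": np.eye(4), "keypoints": [np.array([1.0, 2.0, 5.0])]}]

def to_map_frame(pose_cam, T_map_cam):
    """S300: convert a camera-frame key-frame pose into the map frame using
    the fixed transform between the camera and a preset map point."""
    return T_map_cam @ pose_cam

def process(images_by_mode, T_map_cam):
    lists = {m: {"keyframes": [], "keypoints": [], "positions": [], "cloud": []}
             for m in images_by_mode}
    for mode, images in images_by_mode.items():
        for img in images:
            for kf in extract_keyframes(img):                       # S100
                lists[mode]["keyframes"].append(kf)                 # S200
                lists[mode]["keypoints"].extend(kf["keypoints"])    # S200
                pose_map = to_map_frame(kf["pose_cam"], T_map_cam)  # S300
                lists[mode]["positions"].append(pose_map[:3, 3])    # S300
                # S400: reverse check — look the key points up again through
                # the converted key frame; S500: merge them into a point cloud.
                lists[mode]["cloud"].extend(kf["keypoints"])        # S400 + S500
    return lists

T = np.eye(4); T[:3, 3] = [10.0, 0.0, 0.0]   # assumed camera-to-map transform
out = process({"mono": [None], "stereo": [None]}, T)
assert np.allclose(out["stereo"]["positions"][0], [10.0, 0.0, 0.0])
```

Because each mode keeps its own set of lists, the monocular and binocular pipelines run independently, which is what lets the method keep positioning when one camera's view is occluded.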
In this embodiment, storing the key frame data and the key point data into the corresponding data linked lists according to the monocular and binocular modes means: for each monocular image, the monocular key frame data and the monocular key point data are stored into a corresponding monocular key frame data linked list and a corresponding monocular key point data linked list; and for each binocular image, the binocular key frame data and the binocular key point data are stored into a corresponding binocular key frame data linked list and a corresponding binocular key point data linked list.
The distance information of monocular data is estimated from camera parameters and therefore carries a larger error than the distance obtained from binocular disparity calculation; storing the two separately prevents the monocular data from degrading the more accurate binocular data.
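The binocular distance mentioned above follows the standard stereo disparity relation Z = f·B/d, with focal length f in pixels, baseline B, and disparity d. A minimal sketch, with the camera parameters assumed for illustration:

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Standard pinhole stereo relation: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid depth")
    return focal_px * baseline_m / disparity_px

# assumed parameters: 700 px focal length, 12 cm baseline, 21 px disparity
z = depth_from_disparity(700.0, 0.12, 21.0)
assert abs(z - 4.0) < 1e-9   # 700 * 0.12 / 21 = 4.0 m
```

Monocular depth, by contrast, is only recoverable up to scale from camera parameters and motion, which is why the patent treats it as the less accurate source.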
As shown in fig. 3 and fig. 4, in this embodiment, processing monocular and binocular data simultaneously expands the effective area of available data, and in a multi-camera scene, positioning failure caused by occlusion or by a large featureless area appearing in the field of view is effectively avoided.
As an improvement of this embodiment, before the position data is stored in the corresponding data linked list, it is filtered to obtain stable position data, and the stable position data is stored in the corresponding data linked list. In this embodiment, the position data is filtered by a Kalman filter.
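The patent does not specify the filter model. As one illustrative possibility only, a scalar constant-position Kalman filter smoothing a single coordinate of the position data could look like this:

```python
class Kalman1D:
    """Minimal scalar Kalman filter (constant-position model) for smoothing
    one coordinate of the position data before it is stored."""
    def __init__(self, q: float = 1e-3, r: float = 1e-1):
        self.q, self.r = q, r      # process / measurement noise variances
        self.x, self.p = None, 1.0

    def update(self, z: float) -> float:
        if self.x is None:         # initialize from the first measurement
            self.x = z
            return self.x
        self.p += self.q                       # predict
        k = self.p / (self.p + self.r)         # Kalman gain
        self.x += k * (z - self.x)             # correct
        self.p *= (1.0 - k)
        return self.x

kf = Kalman1D()
noisy = [10.0, 10.4, 9.7, 10.1, 9.9, 10.2]
smoothed = [kf.update(z) for z in noisy]
# the filtered track varies less than the raw measurements
assert max(smoothed) - min(smoothed) < max(noisy) - min(noisy)
```

In practice each coordinate of the position (or a full state vector with velocity) would get its own filter state, with `q` and `r` tuned to the camera noise.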
Example 2:
the multi-camera visual positioning system is used for carrying out visual positioning through the multi-camera visual positioning method disclosed by the embodiment.
The system comprises a key point extraction module, a storage module, a coordinate conversion module, a reverse check module, and a merging module. The key point extraction module is used for extracting key frame data from the monocular images and the binocular images based on an ORB feature point identification algorithm and a tracking algorithm, and for obtaining key point data through the relationship pointers between key frames and key points. The storage module is used for respectively storing the key frame data and the key point data into the corresponding data linked lists according to the monocular and binocular modes. The coordinate conversion module is used for performing coordinate conversion on the key frame data for the monocular and binocular modes respectively, based on the positional relationship between the camera and a preset point in a coordinate map, to obtain corresponding position data, and for storing the position data into the corresponding data linked list. The reverse check module is used for performing a reverse check on the coordinate-converted key frame data to obtain key point data. The merging module is used for merging the point cloud data for the monocular and binocular modes respectively and storing the merged point cloud data into the corresponding data linked list.
Here, storing the key frame data and the key point data into the corresponding data linked lists according to the monocular and binocular modes means: for each monocular image, the monocular key frame data and the monocular key point data are stored into a corresponding monocular key frame data linked list and a corresponding monocular key point data linked list; and for each binocular image, the binocular key frame data and the binocular key point data are stored into a corresponding binocular key frame data linked list and a corresponding binocular key point data linked list.
The distance information of monocular data is estimated from camera parameters and therefore carries a larger error than the distance obtained from binocular disparity calculation; storing the two separately prevents the monocular data from degrading the more accurate binocular data.
Before the position data is stored in the corresponding data linked list, it is filtered to obtain stable position data, and the stable position data is stored in the corresponding data linked list. In this system, the position data is filtered by a Kalman filter.
By processing monocular and binocular data simultaneously, the effective area of available data is expanded, and in a multi-camera scene, positioning failure caused by occlusion or by a large featureless area in the field of view is effectively avoided.
Example 3:
The present invention provides an electronic device comprising at least one memory and at least one processor. The at least one memory is configured to store a machine-readable program; the at least one processor is configured to call the machine-readable program and execute the method disclosed in embodiment 1.
Example 4:
A computer-readable medium of the present invention has computer instructions stored thereon; when the computer instructions are executed by a processor, the processor is caused to execute the method disclosed in embodiment 1. Specifically, a system or apparatus may be provided that is equipped with a storage medium storing software program code realizing the functions of any of the above-described embodiments, and a computer (or a CPU or MPU) of the system or apparatus reads out and executes the program code stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform a part or all of the actual operations based on instructions of the program code.
Further, the program code read out from the storage medium may be written to a memory provided in an expansion board inserted into the computer or in an expansion unit connected to the computer, after which a CPU or the like mounted on the expansion board or expansion unit performs part or all of the actual operations based on the instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
It should be noted that not all steps and modules in the above flows and system structure diagrams are necessary, and some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and can be adjusted as required. The system structure described in the above embodiments may be a physical structure or a logical structure, that is, some modules may be implemented by the same physical entity, or some modules may be implemented by a plurality of physical entities, or some components in a plurality of independent devices may be implemented together.
In the above embodiments, a hardware unit may be implemented mechanically or electrically. For example, a hardware unit may comprise permanently dedicated circuitry or logic (such as a dedicated processor, an FPGA, or an ASIC) to perform the corresponding operations. A hardware unit may also comprise programmable logic or circuitry, such as a general-purpose processor or other programmable processor, temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, permanently dedicated, or temporarily configured) may be determined based on cost and time considerations.
While the invention has been shown and described in detail in the drawings and in the preferred embodiments, it is not intended to be limited to the embodiments disclosed. It will be apparent to those skilled in the art that various combinations of the technical features of the embodiments described above may be used to obtain further embodiments of the invention, which are also within the scope of the invention.