CN113784026A - Method, apparatus, device and storage medium for calculating position information based on image - Google Patents


Info

Publication number
CN113784026A
Authority
CN
China
Prior art keywords
image
infrared
visible
obtaining
pixel point
Prior art date
Legal status
Granted
Application number
CN202111003368.XA
Other languages
Chinese (zh)
Other versions
CN113784026B (en)
Inventor
陈俊宏
魏恒嘉
何震宇
杨旸
Current Assignee
Shenzhen Graduate School Harbin Institute of Technology
Peng Cheng Laboratory
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Peng Cheng Laboratory
Priority date
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology, Peng Cheng Laboratory filed Critical Shenzhen Graduate School Harbin Institute of Technology
Priority to CN202111003368.XA priority Critical patent/CN113784026B/en
Publication of CN113784026A publication Critical patent/CN113784026A/en
Application granted granted Critical
Publication of CN113784026B publication Critical patent/CN113784026B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 - Camera processing pipelines; Components thereof
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/50 - Constructional details
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 - Control of cameras or camera modules
    • H04N23/64 - Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/30 - Transforming light or analogous information into electric information
    • H04N5/33 - Transforming infrared radiation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the field of image processing technologies, and in particular to a method, an apparatus, a device, and a storage medium for calculating position information based on an image. The method comprises: acquiring a visible light image and an infrared image, both captured at the same time by an image acquisition device; extracting visible feature points of the visible light image and the corresponding infrared feature points in the infrared image; and obtaining the position information of the image acquisition device according to the visible feature points and the infrared feature points. The invention comprehensively exploits the advantages of the visible light image and the infrared image, so that even when the image acquisition device works in a dark environment with insufficient light, the visible light image and the infrared image it acquires can be analyzed to obtain the feature points of both images, and the position information of the image acquisition device can then be obtained through those feature points.

Description

Method, apparatus, device and storage medium for calculating position information based on image
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for calculating position information based on an image.
Background
At present, methods for acquiring position information based on images include the feature point method, the optical flow method, the direct method, and deep learning methods. The feature point method relies on extracting feature textures from the environment for tracking and matching to obtain position information. The optical flow method is a development of the feature point method that extracts feature points by means of the gray-level changes of image pixel points. The direct method optimizes the position information directly through the photometric error of image blocks and can estimate motion in the environment without extracting key points. In general, obtaining position information depends on images taken in a stable and sufficiently lit environment, and is unstable in dark environments or under changing illumination. The infrared image collected by the infrared camera of an image acquisition device is little affected by visible light and can be obtained even in a dark environment; it performs well in fields such as military night operations, night monitoring, and cave exploration, but suffers from unclear texture, low contrast, and a low signal-to-noise ratio. The visible light image collected by the visible light lens of an image acquisition device is greatly affected by ambient light, and good image quality requires a sufficiently lit environment, but the texture of the visible light image is clear. From the above analysis, the infrared image and the visible light image each have advantages and disadvantages, and if only one of them is used to calculate the position information of the image acquisition device, the accuracy of the calculated position information suffers.
In summary, the accuracy of the position information calculated by the prior art is low.
Thus, there is a need for improvements and enhancements in the art.
Disclosure of Invention
In order to solve the technical problems, the invention provides a method, a device, equipment and a storage medium for calculating position information based on images, which solve the problem of low accuracy of the position information calculated in the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, the present invention provides a method for calculating position information based on an image, wherein the method comprises:
acquiring a visible light image and an infrared image, wherein the visible light image and the infrared image are acquired at the same time through an image acquisition device;
extracting visible characteristic points of the visible light image and infrared characteristic points corresponding to the visible characteristic points in the infrared image;
and obtaining the position information of the image acquisition device according to the visible characteristic points and the infrared characteristic points.
In one implementation, the obtaining the position information of the image capturing device according to the visible feature points and the infrared feature points includes:
obtaining the position of the visible characteristic point in the visible light image according to the visible light image;
obtaining the position of the infrared characteristic point in the infrared image according to the infrared image;
and obtaining the position information according to the position of the visible characteristic point in the visible light image and the position of the infrared characteristic point in the infrared image.
In one implementation, the obtaining the position information of the image capturing device according to the visible feature points and the infrared feature points includes:
according to the visible feature points, visible pixel point information in the region where the visible feature points are located in the visible light image is obtained;
according to the infrared characteristic points, obtaining infrared pixel point information in the area where the infrared characteristic points are located in the infrared image;
and obtaining the pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information and the infrared pixel point information.
In one implementation, the obtaining the pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information, and the infrared pixel point information includes:
acquiring depth images corresponding to the visible light image and the infrared image;
and obtaining the pose according to the visible feature points, the infrared feature points, the visible pixel point information, the infrared pixel point information and the depth image.
In an implementation manner, the obtaining, according to the infrared feature point, infrared pixel point information in an area where the infrared feature point is located in the infrared image includes:
according to the infrared region where the infrared characteristic point in the infrared image is located, visible pixel point information in a visible region corresponding to the infrared region in the visible light image is obtained, and the visible pixel point information in the visible region is recorded as infrared supplementary pixel point information;
and according to the infrared supplementary pixel point information, obtaining the infrared pixel point information in the area where the infrared characteristic point is located.
In one implementation manner, the obtaining, according to an infrared region where the infrared feature point in the infrared image is located, visible pixel point information in a visible region corresponding to the infrared region in the visible light image, and recording the visible pixel point information in the visible region as infrared supplementary pixel point information includes:
obtaining adjacent infrared pixel point pairs in the infrared region according to the infrared region where the infrared characteristic points are located;
comparing the pixel values of the adjacent pixel point pairs to obtain a comparison result;
obtaining an adjacent visible pixel point pair matched with the comparison result in the visible light image according to the visible light image;
obtaining a visible region of the adjacent visible pixel point pairs in the visible light image according to the adjacent visible pixel point pairs;
and taking the visible pixel point information in the visible area as the infrared supplementary pixel point information.
In one implementation, the obtaining the pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information, and the infrared pixel point information includes:
obtaining a first frame image and a second frame image of visible light in the visible light image according to the visible light image, wherein the first frame image of visible light and the second frame image of visible light are obtained at different moments through the image acquisition device;
according to the infrared image, obtaining an infrared first frame image and an infrared second frame image in the infrared image, wherein the infrared first frame image and the infrared second frame image are obtained at different moments through the image acquisition device;
according to the infrared supplementary pixel point information in the infrared pixel point information, obtaining infrared characteristic points matched with the infrared first frame image and the infrared second frame image;
obtaining visible feature points matched with the first frame image of the visible light and the second frame image of the visible light according to the visible feature points in the first frame image of the visible light, the visible feature points in the second frame image of the visible light and visible pixel point information corresponding to the visible feature points;
and obtaining the pose in the position information according to the matched infrared characteristic points and the matched visible characteristic points.
In one implementation manner, the obtaining the pose in the position information according to the matched infrared feature points and the matched visible feature points includes:
when the number corresponding to the matched infrared characteristic points is larger than an infrared set value and the number corresponding to the matched visible characteristic points is also larger than a visible set value, the pose in the position information is obtained through the matched visible characteristic points;
and when one of the number corresponding to the matched infrared characteristic points and the number corresponding to the matched visible characteristic points is smaller than a set value, obtaining the pose in the position information through the other.
In one implementation, the obtaining the pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information, and the infrared pixel point information includes:
obtaining a visible current frame image and a visible previous frame image in the visible light image according to the visible light image, wherein the time for obtaining the visible previous frame image through the image acquisition device is before the time for obtaining the visible current frame image;
according to the infrared image, obtaining an infrared current frame image and an infrared previous frame image in the infrared image, wherein the time for obtaining the infrared previous frame image through the image acquisition device is before the time for obtaining the infrared current frame image;
and optimizing the pose of the image acquisition device when acquiring the infrared current frame image and the visible current frame image according to the visible feature points and the visible pixel point information corresponding to the previous frame image of the visible light and the infrared feature points and the infrared pixel point information corresponding to the previous frame image of the infrared light, so as to obtain the optimized pose.
In one implementation, the visible light image is acquired by a visible light lens of the image acquisition device;
the infrared image is acquired through an infrared lens of the image acquisition device;
and obtaining a near-infrared image in the infrared image according to the infrared image.
In a second aspect, an embodiment of the present invention further provides an apparatus for calculating position information based on an image, where the apparatus includes the following components:
the image acquisition module is used for acquiring a visible light image and an infrared image, and the visible light image and the infrared image are acquired at the same time through an image acquisition device;
the characteristic point extraction module is used for extracting visible characteristic points of the visible light image and infrared characteristic points corresponding to the visible characteristic points in the infrared image;
and the position information acquisition module is used for acquiring the position information of the image acquisition device according to the visible characteristic points and the infrared characteristic points.
In a third aspect, an embodiment of the present invention further provides a terminal device, where the terminal device includes a memory, a processor, and a program for calculating position information based on an image that is stored in the memory and executable on the processor, and the processor implements the steps of the method for calculating position information based on an image when executing the program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a program for calculating position information based on an image is stored, and when the program is executed by a processor, the steps of the above-mentioned method for calculating position information based on an image are implemented.
Advantageous effects: the collected visible light image places certain requirements on the illumination intensity of the environment, and a clear image is obtained only when the illumination meets those requirements; the visible light image is also not robust to noise. Despite these shortcomings, it has the advantages of clear texture and high contrast. The infrared image has the shortcomings of unclear texture and low contrast, but its acquisition places no requirement on the illumination intensity of the environment, and a usable image can be acquired even in a dark environment. The invention comprehensively exploits the advantages of both: even when the image acquisition device works in a dark environment with insufficient illumination, it can analyze the visible light image and the infrared image it acquires to obtain the feature points of both images (the higher the contrast and the clearer the texture, the higher the quality of the obtained feature points, which benefits subsequent calculation), and then obtain the position information of the image acquisition device from those feature points, so that the device can be further adjusted through the position information and work better. Even when the image acquisition device works in a dark environment, the method provided by the invention can still accurately obtain its position information through the above analysis.
Drawings
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is an infrared image after gamma conversion processing of the present invention;
FIG. 3 is an infrared image after the median filtering process of the present invention;
FIG. 4 is a prior art matching graph using full matching;
FIG. 5 is a prior-art matching graph using a distance threshold;
FIG. 6 is a matching graph of an infrared image fused with a visible image according to the present invention;
FIG. 7 is a flowchart of the SLAM system acquiring the pose of the image capture device in an embodiment;
FIG. 8 is a flow diagram of a tracking module in a SLAM system;
fig. 9 is a flow diagram of a reference key frame model tracking module in a SLAM system.
Detailed Description
The technical scheme of the invention is described clearly and completely below with reference to the embodiments and the accompanying drawings. All other embodiments obtained by a person of ordinary skill in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Research shows that current methods for acquiring position information based on images include the feature point method, the optical flow method, the direct method, and deep learning methods. The feature point method relies on extracting feature textures from the environment for tracking and matching to obtain position information. The optical flow method is a development of the feature point method that extracts feature points by means of the gray-level changes of image pixel points. The direct method optimizes the position information directly through the photometric error of image blocks and can estimate motion in the environment without extracting key points. In general, obtaining position information depends on images taken in a stable and sufficiently lit environment, and is unstable in dark environments or under changing illumination. The infrared image collected by the infrared camera of an image acquisition device is little affected by visible light and can be obtained even in a dark environment; it performs well in fields such as military night operations, night monitoring, and cave exploration, but suffers from unclear texture, low contrast, and a low signal-to-noise ratio. The visible light image collected by the visible light lens of an image acquisition device is greatly affected by ambient light, and good image quality requires a sufficiently lit environment, but the texture of the visible light image is clear. From the above analysis, the infrared image and the visible light image each have advantages and disadvantages, and if only one of them is used to calculate the position information of the image acquisition device, the accuracy of the calculated position information suffers.
In order to solve the above technical problems, the invention provides a method, an apparatus, a device, and a storage medium for calculating position information based on images, which solve the problem of the low accuracy of position information calculated by the prior art. In specific implementation, a visible light image and an infrared image shot by an image acquisition device at the same time are obtained; visible feature points of the visible light image and the corresponding infrared feature points in the infrared image are extracted; and the position information of the image acquisition device is obtained from the visible feature points and the infrared feature points. The invention comprehensively exploits the advantages of the visible light image and the infrared image: even when the image acquisition device works in a dark environment with insufficient illumination, the visible light image and the infrared image it acquires can be analyzed to obtain the feature points of both images, and the position information of the image acquisition device can then be obtained from those feature points, so that the device can be further adjusted through the position information and work better.
For example, the image capturing device moves during the capturing process, and its position information changes during the movement; the position information may include the changed displacement information and pose information. In this embodiment, the image capturing device acquires two frames at the same time t1, a visible light image a1 and an infrared image b1, and, combined with the two frames acquired before time t1, a visible light image a2 and an infrared image b2, the position information of the image capturing device at time t1 is calculated from the feature point information of the four frames. This embodiment comprehensively exploits the clear texture and high contrast of the visible light image and the fact that the infrared image can be acquired in a dark environment, which improves the accuracy of the obtained position information.
Exemplary method
The method for calculating position information based on an image according to this embodiment can be applied to a terminal device, which may be a terminal product with a video playing function, such as a television or a computer. In this embodiment, as shown in fig. 1, the method for calculating position information based on an image specifically includes the following steps:
S100, acquiring a visible light image and an infrared image, wherein the visible light image and the infrared image are acquired at the same time through an image acquisition device.
In this embodiment, the image acquisition device is a camera: the image captured by its visible light RGB-D lens is the visible light image, and the image captured by its infrared lens is the infrared image, which may be a near-infrared image or a thermal infrared image. This embodiment adopts a near-infrared image captured by an infrared camera. The visible light image and the infrared image of this embodiment are both generated by shooting the same scene.
S200, extracting visible characteristic points of the visible light image and infrared characteristic points corresponding to the visible characteristic points in the infrared image.
Step S200 includes feature point extraction of the visible light image and feature point extraction of the infrared image, and the feature point extraction processes for the two images are respectively described below:
Feature point extraction for the visible light image: the ORB feature point extraction method is used, with FAST corners as the feature points, and the BRIEF method is adopted to compute descriptors from n point pairs around each feature point. To ensure scale invariance of the features, an image pyramid is constructed for the visible light image, which also increases the number of feature points.
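The extraction just described can be prototyped, for illustration only, with OpenCV's ORB detector, which combines FAST corners, BRIEF descriptors, and an image pyramid; the parameter values below (feature budget, pyramid depth, scale step) are assumptions, not values from the patent:

```python
import cv2

# ORB = FAST corner detection + rotation-aware BRIEF descriptors,
# computed over an image pyramid for scale invariance.
orb = cv2.ORB_create(
    nfeatures=1000,    # feature point budget (assumed)
    scaleFactor=1.2,   # scale step between pyramid levels (assumed)
    nlevels=8,         # pyramid depth (assumed)
)

def extract_visible_features(gray_visible):
    # Returns FAST keypoints and their 256-bit BRIEF descriptors,
    # packed as an N x 32 uint8 array.
    keypoints, descriptors = orb.detectAndCompute(gray_visible, None)
    return keypoints, descriptors
```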
Feature point extraction for the infrared image: gamma conversion is first applied to the infrared image, followed by median filtering, to optimize its contrast and ensure that feature points can be extracted. Fig. 2 shows an infrared image after gamma conversion, and fig. 3 an infrared image after median filtering.
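A minimal sketch of this preprocessing follows; the gamma value and the median kernel size are assumptions, since the patent does not specify them:

```python
import cv2
import numpy as np

def preprocess_infrared(gray_ir, gamma=0.6, ksize=5):
    # Gamma conversion via a lookup table; gamma < 1 brightens the dark,
    # low-contrast range typical of infrared images (value assumed).
    lut = np.array([255.0 * (i / 255.0) ** gamma for i in range(256)],
                   dtype=np.uint8)
    corrected = cv2.LUT(gray_ir, lut)
    # Median filtering suppresses the salt-and-pepper noise of low
    # signal-to-noise infrared sensors while preserving edges.
    return cv2.medianBlur(corrected, ksize)
```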
And S300, obtaining the position information of the image acquisition device according to the visible characteristic points and the infrared characteristic points. The step S300 includes the following steps S301, S302, S303, S304:
S301, according to the visible feature points, obtaining visible pixel point information in the region where the visible feature points are located in the visible light image.
Acquiring the pixel point information near the feature points facilitates matching the feature points of the current visible light frame with those of earlier visible light frames; only when the matching is accurate can subsequent calculation over the matched feature points of the two frames accurately yield the pose of the image acquisition device (camera) at the moment the current frame was shot.
For visible light images, the matching process is as follows: the feature points in the two frames are completely matched using a traversal search method, giving the result shown in fig. 4. When matching is performed only where the descriptor distance between feature points is smaller than a threshold, the result shown in fig. 5 is obtained.
S302, according to the infrared characteristic points, obtaining infrared pixel point information in the area where the infrared characteristic points are located in the infrared image.
Because the infrared image has few textures and low contrast, the distances between infrared pixel points in two frames of infrared images come out nearly equal, so feature points cannot be matched through the pixel points of the two infrared frames alone, and forcing such a match only increases the probability of mismatching. Therefore, to reduce the mismatching rate of infrared feature points, visible light description information can be added to the description information of the infrared feature points; fusing the visible light information into the infrared information improves the accuracy of infrared feature point matching. Accordingly, the infrared pixel point information in step S302 consists of pixel points fused with pixel points from the visible light image; in this embodiment, it is the pixel points to which the pixel points from the visible light image (the infrared supplementary pixel point information) have been added. The specific process comprises: according to the infrared region where an infrared feature point in the infrared image is located, obtaining the visible pixel point information in the visible region of the visible light image corresponding to that infrared region. That is: according to the infrared region where the infrared feature point is located, obtain the adjacent infrared pixel point pairs in the infrared region; compare the pixel values of each adjacent pair to obtain a comparison result; from the visible light image, obtain the adjacent visible pixel point pair matching that comparison result; from the adjacent visible pixel point pairs, obtain the visible region they occupy in the visible light image; and take the visible pixel point information in that visible region as the infrared supplementary pixel point information.
In this embodiment, the pixel points (descriptors) around the feature points acquired in steps S301 and S302 are fusion descriptors based on the nBRIEF descriptor, which are more accurate than an infrared SLAM using only nBRIEF descriptors. If SIFT or SURF were used for feature extraction, description would rely on 128-dimensional gradient-histogram vectors, which would increase the accuracy of feature matching but also the time required for matching.
The principle of acquiring the infrared supplementary pixel point information is described in detail as follows:
the following formula is adopted for both the visible light image and the infrared image to obtain the corresponding area of the feature point in the infrared image in the visible light image.
Figure BDA0003236304640000091
Figure BDA0003236304640000101
Wherein p (x) and p (y) are the gray values of a pair of adjacent infrared pixel point pairs in the infrared image, and τ (p; x, y) is the result of comparing the gray values of the adjacent infrared pixel point pairs. When p (x) is greater than p (y), τ (p; x, y) has a value of 1, otherwise it is 0. Desn(p) the accumulated values of tau (p; x, y) corresponding to all the adjacent infrared pixel point pairs. The above calculation process is also applied to the visible light image.
For example, suppose 128 adjacent infrared pixel point pairs lie near one point of the infrared image. Each pair is compared, and the comparison results of all pairs are accumulated into a result s1; 128 adjacent visible pixel point pairs are likewise found in the visible light image and accumulated in the same manner. When the two accumulated results are the same or similar, the pixels of the region in the visible light image where those 128 adjacent pixel pairs lie can be used as the infrared supplementary pixel point information of the infrared image.
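A sketch of this comparison-and-accumulation test follows, assuming the n adjacent point pairs are supplied as pixel coordinates; the sampling pattern and the similarity tolerance are assumptions:

```python
import numpy as np

def accumulate_comparisons(gray, pairs):
    """Des_n(p): accumulate tau(p; x, y) over n adjacent pixel point pairs.

    gray  -- single-channel image as a 2D numpy array
    pairs -- sequence of ((row_x, col_x), (row_y, col_y)) coordinates
    """
    total = 0
    for (xr, xc), (yr, yc) in pairs:
        # tau(p; x, y) = 1 when p(x) > p(y), otherwise 0
        total += int(gray[xr, xc] > gray[yr, yc])
    return total

def regions_match(s_infrared, s_visible, tolerance=4):
    # "Same or similar" accumulated results; the tolerance is assumed.
    return abs(s_infrared - s_visible) <= tolerance
```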
Similarly, the feature points of two infrared frames augmented with visible light pixel point information are matched; only when the matching is accurate can subsequent calculation over the matched feature points of the two frames accurately yield the pose of the image acquisition device (camera) at the moment the current frame was shot.
Feature point matching for the infrared image adopts the following formula:

Dist(x, y) = Σ_i [ α · |Dis_i(x) - Dis_i(y)| + β · |R_i(x) - R_i(y)| ],  with α + β = 1

Dist(x, y) is the distance between feature points x and y in the two infrared frames and is used to judge whether the two feature points match; α and β are constant coefficients. Dis_i(x) denotes the i-th dimension information of the visible light descriptor at the position of point x, and Dis_i(y) the i-th dimension information of the visible light descriptor at the position of point y; R_i(x) denotes the i-th dimension information of the near-infrared descriptor at the position of point x, and R_i(y) the i-th dimension information of the near-infrared descriptor at the position of point y.
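Reading the formula as a per-dimension weighted sum, the fused distance can be sketched as follows; binary descriptor dimensions and the 0.5/0.5 weight split are assumptions:

```python
import numpy as np

def fused_distance(vis_x, ir_x, vis_y, ir_y, alpha=0.5, beta=0.5):
    """Weighted descriptor distance between feature points x and y.

    vis_*, ir_* -- visible-light and near-infrared descriptors as 1D
                   numpy arrays of equal length (binary dims assumed).
    alpha + beta must equal 1; the 0.5/0.5 split is an assumption.
    """
    d_vis = np.sum(np.abs(vis_x.astype(int) - vis_y.astype(int)))
    d_ir = np.sum(np.abs(ir_x.astype(int) - ir_y.astype(int)))
    return alpha * d_vis + beta * d_ir
```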
With the matching method of this embodiment, the matching effect shown in fig. 6 is obtained; as the figure shows, the method of this embodiment improves the matching rate of the feature points.
The matching method of this embodiment is based on ORB feature point extraction. Similar effects can be obtained with other extraction methods: SIFT and SURF are more precise than ORB features but run more slowly (extracting 30 frames with SIFT or SURF takes about 20 seconds, while ORB finishes in under one second). Using line features alone is slower than point feature extraction and also less accurate than ORB features.
And S303, acquiring a depth image corresponding to the visible light image and the infrared image.
The depth image of the embodiment is a depth image of a scene corresponding to the collected visible light image and infrared image.
S304, obtaining the pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information and the infrared pixel point information.
When this embodiment uses only the feature points and the pixel point information around the feature points to obtain the pose (position and orientation) of the image capturing device, the method includes: obtaining a first frame image and a second frame image of visible light in the visible light image according to the visible light image, wherein the first frame image of visible light and the second frame image of visible light are obtained at different moments through the image acquisition device; according to the infrared image, obtaining an infrared first frame image and an infrared second frame image in the infrared image, wherein the infrared first frame image and the infrared second frame image are obtained at different moments through the image acquisition device; according to the infrared supplementary pixel point information in the infrared pixel point information, obtaining infrared feature points matched between the infrared first frame image and the infrared second frame image; obtaining visible feature points matched between the first frame image of visible light and the second frame image of visible light according to the visible feature points in the first frame image of visible light, the visible feature points in the second frame image of visible light, and the visible pixel point information corresponding to the visible feature points; and obtaining the pose in the position information according to the matched infrared feature points and the matched visible feature points.
This embodiment can also add the depth image on the basis of the feature points and the pixel point information around them, combining the three to obtain the pose of the image acquisition device.
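One common way to realize this step, sketched here under pinhole-camera assumptions rather than as the patent's mandated algorithm, is to back-project the matched points of the previous frame into 3D using the depth image and solve a Perspective-n-Point problem against their matches in the current frame:

```python
import cv2
import numpy as np

def estimate_pose(pts_prev, pts_curr, depth_prev, K):
    """Pose of the current frame from matched 2D points and a depth image.

    pts_prev, pts_curr -- N x 2 pixel coordinates of matched feature points
    depth_prev         -- depth image aligned with the previous frame (metres)
    K                  -- 3 x 3 camera intrinsic matrix
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    pts3d, pts2d = [], []
    for (u, v), q in zip(pts_prev, pts_curr):
        z = float(depth_prev[int(v), int(u)])
        if z <= 0:  # skip pixels without a valid depth reading
            continue
        # Back-project the previous-frame pixel to a 3D point.
        pts3d.append([(u - cx) * z / fx, (v - cy) * z / fy, z])
        pts2d.append(q)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts3d, dtype=np.float64),
        np.asarray(pts2d, dtype=np.float64),
        K.astype(np.float64), None)
    return rvec, tvec  # rotation (Rodrigues vector) and translation
```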
The pose of the image acquisition device can be optimized to obtain a more accurate pose. The optimization process comprises the following steps: obtaining a visible current frame image and a visible previous frame image in the visible image according to the visible image, wherein the time for obtaining the visible previous frame image through the image acquisition device is before the time for obtaining the visible current frame image; according to the infrared image, obtaining an infrared current frame image and an infrared previous frame image in the infrared image, wherein the time for obtaining the infrared previous frame image through the image acquisition device is before the time for obtaining the infrared current frame image; and optimizing the pose of the image acquisition device when acquiring the infrared current frame image and the visible current frame image according to the visible feature points and the visible pixel point information corresponding to the previous frame image of the visible light and the infrared feature points and the infrared pixel point information corresponding to the previous frame image of the infrared light, so as to obtain the optimized pose.
Optimizing the pose of the image capturing device essentially corrects the pose of the current image capturing device using the previously acquired pose data of the image capturing device.
Because this embodiment uses data from both the visible light image and the infrared image, the method of the present invention can still obtain pose data for the image acquisition device when the data of one of the images is missing. The specific process includes: when the number of matched infrared feature points is larger than an infrared set value and the number of matched visible feature points is also larger than a visible set value, the pose in the position information is obtained through the matched visible feature points; and when one of the two numbers is smaller than its set value, the pose in the position information is obtained through the other.
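The selection logic reads as sketched below; the set values (thresholds) are placeholders, since the patent leaves them unspecified:

```python
def choose_pose_source(n_ir, n_vis, ir_set_value=30, vis_set_value=30):
    # Thresholds are assumed placeholders, not values from the patent.
    if n_ir > ir_set_value and n_vis > vis_set_value:
        return "visible"    # both healthy: the sharper visible matches win
    if n_vis <= vis_set_value and n_ir > ir_set_value:
        return "infrared"   # visible tracking degraded: fall back to infrared
    if n_ir <= ir_set_value and n_vis > vis_set_value:
        return "visible"    # infrared degraded: rely on visible
    return "lost"           # neither modality has enough matches
```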
By way of example, the overall process of the invention is illustrated with a SLAM (simultaneous localization and mapping) system:
As shown in fig. 7, two frames (a visible light image and an infrared image) shot by the image acquisition device (camera) of the same scene are obtained first, and the feature points of the two images are extracted separately. Once feature extraction is complete, the two-dimensional feature points of each image are available; combined with the acquired depth image, these feature points can be built into three-dimensional map points (the map essentially stores the feature points contained in the images obtained by the camera, increasing the available data and facilitating subsequent calculation of the camera pose). This yields a common frame containing the feature points, their matched description information (descriptors, i.e. the pixel point information around the feature points), and the three-dimensional map points corresponding to the feature points. The common frame is then fed to the tracking module, which uses the information held by the system to match it and calculate the pose of the current frame in the world coordinate system. Fig. 8 shows the overall flow of the tracking module of this embodiment.
The tracking module can be divided into two major parts: a camera tracking module and a local map tracking module. The camera tracking module essentially detects whether the camera has captured a new image, and the local map tracking module essentially extracts feature points from the captured image and stores them. The camera tracking module is described below:
A camera tracking module: the camera model tracks by matching a previously generated frame with the current frame and calculates the pose of the current frame by means of graph optimization; the pose is then refined through local map tracking, and a more accurate pose is finally output. The camera tracking module further comprises reference key frame model tracking and motion model tracking, as shown in fig. 9. Reference key frame model tracking uses the most recently established key frame (i.e. the one closest in time to the current frame; a key frame is a frame image representative of the captured scene), finds the key frame map point descriptors nearest to the vectorized feature point descriptors of the current frame through bag-of-words matching, and then calculates the current pose through graph optimization. The motion tracking module follows a flow similar to reference key frame tracking, judging the tracking state through pose initialization, matching, pose optimization, and outlier elimination. The difference lies in initialization: reference key frame tracking initializes the current frame pose directly from the key frame, setting it to the pose of the reference key frame, whereas motion tracking calculates the motion state of the system from the pose transformation of the previous two frames, initializes the current pose from that motion state and the pose of the previous frame, and matches the current feature points with those of the previous frame by projection matching.
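A sketch of the constant-velocity initialization used by motion tracking, assuming poses are given as 4x4 homogeneous camera-to-world matrices:

```python
import numpy as np

def predict_current_pose(T_km1, T_km2):
    """Initial pose guess for frame k from the poses of frames k-1 and k-2.

    The relative transform between the previous two frames (the 'motion
    state') is replayed onto the last pose: a constant-velocity model.
    """
    motion = T_km1 @ np.linalg.inv(T_km2)  # relative motion k-2 -> k-1
    return motion @ T_km1                  # assume it repeats for k-1 -> k
```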
The tracking module is divided according to the category of tracked image: the camera tracking module and the local map tracking module each comprise a visible light tracking module and a near-infrared tracking module, which are independent yet interactive. When one of them fails to work normally, the other is not affected. This guarantees normal operation of the system under no illumination, or under partial illumination with rich texture, and improves the robustness of the system. When both tracking modules work normally, visible light tracking effectively improves the tracking accuracy.
In this embodiment, the Absolute Trajectory Error (ATE) is used to evaluate the poses obtained by the prior-art ORB-SLAM2, ORB-SLAM3, LSD-SLAM, and VI-SLAM-A, and by VI-SLAM-B of the present invention, yielding the ATE values in Tables 1 to 4.
As can be seen from Tables 1 to 3, compared with working on infrared alone in dark and dim environments, in a bright environment the information extracted by the visible light camera (which shoots the visible light image) can effectively increase the positioning accuracy of the system (the pose information of the camera); in smooth and simple environments, the accuracy difference between the fusion SLAM, the single-visible-light SLAM, and the dual-light SLAM without fusion is small. In a complex environment, as shown in Table 4, the infrared SLAM within the dual-light SLAM without fusion falls well behind the single-visible-light SLAM in precision because of its higher mismatching probability, while the dual-light fusion SLAM (the method of the present invention) has a low mismatching rate and performs slightly better than the single-visible-light SLAM. Here VI-SLAM-A is the dual-light SLAM, VI-SLAM-B is the fused-feature algorithm of the present invention, and ORB-SLAM2 and ORB-SLAM3 are both single-visible-light SLAMs.
Tables 1 to 4 (ATE values for ORB-SLAM2, ORB-SLAM3, LSD-SLAM, VI-SLAM-A, and VI-SLAM-B; the tables are reproduced as images in the original document and their data are not recoverable here)
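For reference, a minimal sketch of the metric under a common definition of ATE (the root-mean-square translational error over time-associated poses; the trajectory alignment step is omitted):

```python
import numpy as np

def absolute_trajectory_error(estimated, ground_truth):
    """RMSE of translational error between two trajectories.

    estimated, ground_truth -- N x 3 arrays of camera positions, already
    time-associated and expressed in a common coordinate frame.
    """
    err = np.linalg.norm(np.asarray(estimated) - np.asarray(ground_truth),
                         axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```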
In summary, the invention comprehensively exploits the advantages of the visible light image and the infrared image: even when the image acquisition device works in a dark environment with insufficient illumination, it can analyze the visible light image and the infrared image it acquires to obtain the feature points of both images (the higher the contrast and the clearer the texture, the higher the quality of the obtained feature points, which benefits subsequent calculation), and then obtain the position information of the image acquisition device from those feature points, so that the device can be further adjusted through the position information and work better. Even when the image acquisition device works in a dark environment, the method provided by the invention can still accurately obtain its position information through the above analysis.
Exemplary devices
This embodiment also provides an apparatus for calculating position information based on an image, which comprises the following components:
the image acquisition module is used for acquiring a visible light image and an infrared image, and the visible light image and the infrared image are acquired at the same time through an image acquisition device;
the characteristic point extraction module is used for extracting visible characteristic points of the visible light image and infrared characteristic points corresponding to the visible characteristic points in the infrared image;
and the position information acquisition module is used for acquiring the position information of the image acquisition device according to the visible characteristic points and the infrared characteristic points.
Based on the foregoing embodiment, the present invention further provides a terminal device, where the terminal device includes a memory, a processor, and a program for calculating location information based on an image, the program being stored in the memory and being executable on the processor, and the processor implements the step of calculating location information based on an image when executing the program for calculating location information based on an image.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
In summary, the present invention discloses a method, an apparatus, a device and a storage medium for calculating location information based on an image, wherein the method comprises: acquiring a visible light image and an infrared image, wherein the visible light image and the infrared image are acquired at the same time through an image acquisition device; extracting visible characteristic points of the visible light image and infrared characteristic points corresponding to the visible characteristic points in the infrared image; and obtaining the position information of the image acquisition device according to the visible characteristic points and the infrared characteristic points.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (13)

1. An image-based method of calculating position information, the method comprising:
acquiring a visible light image and an infrared image, wherein the visible light image and the infrared image are acquired at the same time through an image acquisition device;
extracting visible characteristic points of the visible light image and infrared characteristic points corresponding to the visible characteristic points in the infrared image;
and obtaining the position information of the image acquisition device according to the visible characteristic points and the infrared characteristic points.
2. The method of claim 1, wherein obtaining the position information of the image capturing device according to the visible feature points and the infrared feature points comprises:
obtaining the position of the visible characteristic point in the visible light image according to the visible light image;
obtaining the position of the infrared characteristic point in the infrared image according to the infrared image;
and obtaining the position information according to the position of the visible characteristic point in the visible light image and the position of the infrared characteristic point in the infrared image.
3. The method of claim 1, wherein obtaining the position information of the image capturing device according to the visible feature points and the infrared feature points comprises:
according to the visible feature points, visible pixel point information in the region where the visible feature points are located in the visible light image is obtained;
according to the infrared characteristic points, obtaining infrared pixel point information in the area where the infrared characteristic points are located in the infrared image;
and obtaining the pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information and the infrared pixel point information.
4. The image-based method for calculating position information according to claim 3, wherein the obtaining a pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information, and the infrared pixel point information comprises:
acquiring depth images corresponding to the visible light image and the infrared image;
and obtaining the pose according to the visible feature points, the infrared feature points, the visible pixel point information, the infrared pixel point information and the depth image.
5. The image-based method for calculating position information according to claim 3, wherein the obtaining of the infrared pixel point information in the region of the infrared image where the infrared feature point is located according to the infrared feature point comprises:
according to the infrared region where the infrared characteristic point in the infrared image is located, visible pixel point information in a visible region corresponding to the infrared region in the visible light image is obtained, and the visible pixel point information in the visible region is recorded as infrared supplementary pixel point information;
and according to the infrared supplementary pixel point information, obtaining the infrared pixel point information in the area where the infrared characteristic point is located.
6. The image-based method for calculating position information according to claim 5, wherein the obtaining visible pixel point information in a visible region corresponding to the infrared region in the visible light image according to the infrared region in which the infrared feature point in the infrared image is located, and recording the visible pixel point information in the visible region as infrared supplementary pixel point information comprises:
obtaining adjacent infrared pixel point pairs in the infrared region according to the infrared region where the infrared characteristic points are located;
comparing the pixel values of the adjacent pixel point pairs to obtain a comparison result;
obtaining an adjacent visible pixel point pair matched with the comparison result in the visible light image according to the visible light image;
obtaining a visible region of the adjacent visible pixel point pairs in the visible light image according to the adjacent visible pixel point pairs;
and taking the visible pixel point information in the visible area as the infrared supplementary pixel point information.
7. The image-based method for calculating position information according to claim 5, wherein the obtaining a pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information, and the infrared pixel point information comprises:
obtaining a first frame image and a second frame image of visible light in the visible light image according to the visible light image, wherein the first frame image of visible light and the second frame image of visible light are obtained at different moments through the image acquisition device;
according to the infrared image, obtaining an infrared first frame image and an infrared second frame image in the infrared image, wherein the infrared first frame image and the infrared second frame image are obtained at different moments through the image acquisition device;
according to the infrared supplementary pixel point information in the infrared pixel point information, obtaining infrared characteristic points matched with the infrared first frame image and the infrared second frame image;
obtaining visible feature points matched with the first frame image of the visible light and the second frame image of the visible light according to the visible feature points in the first frame image of the visible light, the visible feature points in the second frame image of the visible light and visible pixel point information corresponding to the visible feature points;
and obtaining the pose in the position information according to the matched infrared characteristic points and the matched visible characteristic points.
8. The image-based method for calculating position information according to claim 7, wherein the obtaining the pose in the position information according to the matched infrared feature point and the matched visible feature point comprises:
when the number corresponding to the matched infrared characteristic points is larger than an infrared set value and the number corresponding to the matched visible characteristic points is also larger than a visible set value, the pose in the position information is obtained through the matched visible characteristic points;
and when one of the number corresponding to the matched infrared characteristic points and the number corresponding to the matched visible characteristic points is smaller than a set value, obtaining the pose in the position information through the other.
9. The image-based method for calculating position information according to claim 3, wherein the obtaining a pose in the position information according to the visible feature points, the infrared feature points, the visible pixel point information, and the infrared pixel point information comprises:
obtaining a visible current frame image and a visible previous frame image in the visible light image according to the visible light image, wherein the time for obtaining the visible previous frame image through the image acquisition device is before the time for obtaining the visible current frame image;
according to the infrared image, obtaining an infrared current frame image and an infrared previous frame image in the infrared image, wherein the time for obtaining the infrared previous frame image through the image acquisition device is before the time for obtaining the infrared current frame image;
and optimizing the pose of the image acquisition device when acquiring the infrared current frame image and the visible current frame image according to the visible feature points and the visible pixel point information corresponding to the previous frame image of the visible light and the infrared feature points and the infrared pixel point information corresponding to the previous frame image of the infrared light, so as to obtain the optimized pose.
10. The image-based method for calculating position information according to any one of claims 1 to 9, wherein the visible light image is captured through a visible light lens of the image acquisition device;
the infrared image is captured through an infrared lens of the image acquisition device;
and a near-infrared image is obtained from the infrared image.
11. An apparatus for calculating position information based on an image, the apparatus comprising:
an image acquisition module, configured to acquire a visible light image and an infrared image, wherein the visible light image and the infrared image are acquired at the same time through an image acquisition device;
a feature point extraction module, configured to extract visible feature points of the visible light image and infrared feature points corresponding to the visible feature points in the infrared image;
and a position information acquisition module, configured to obtain the position information of the image acquisition device according to the visible feature points and the infrared feature points.
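The three modules of claim 11 map naturally onto a small pipeline object. The class below is purely a hypothetical illustration of that structure; the class and attribute names are this example's, not the patent's.

```python
# Hypothetical layout mirroring claim 11's three modules.
class PositionFromImagesPipeline:
    def __init__(self, acquisition_module, extraction_module, position_module):
        self.acquire = acquisition_module  # -> (visible_img, infrared_img)
        self.extract = extraction_module   # -> (visible_pts, infrared_pts)
        self.locate = position_module      # -> position information

    def step(self):
        visible_img, infrared_img = self.acquire()
        visible_pts, infrared_pts = self.extract(visible_img, infrared_img)
        return self.locate(visible_pts, infrared_pts)
```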
12. A terminal device, comprising a memory, a processor, and a program for calculating position information based on an image, the program being stored in the memory and executable on the processor, wherein the processor implements the steps of the method for calculating position information based on an image according to any one of claims 1 to 9 when executing the program.
13. A computer-readable storage medium storing a program for calculating position information based on an image, wherein the program, when executed by a processor, implements the steps of the method for calculating position information based on an image according to any one of claims 1 to 9.
CN202111003368.XA 2021-08-30 2021-08-30 Method, apparatus, device and storage medium for calculating position information based on image Active CN113784026B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111003368.XA CN113784026B (en) 2021-08-30 2021-08-30 Method, apparatus, device and storage medium for calculating position information based on image

Publications (2)

Publication Number Publication Date
CN113784026A (en) 2021-12-10
CN113784026B CN113784026B (en) 2023-04-18

Family

ID=78840002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111003368.XA Active CN113784026B (en) 2021-08-30 2021-08-30 Method, apparatus, device and storage medium for calculating position information based on image

Country Status (1)

Country Link
CN (1) CN113784026B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107253485A (en) * 2017-05-16 2017-10-17 北京交通大学 Foreign matter invades detection method and foreign matter intrusion detection means
CN107204015A (en) * 2017-05-27 2017-09-26 中山大学 Instant positioning based on color image and infrared image fusion is with building drawing system
CN109377469A (en) * 2018-11-07 2019-02-22 永州市诺方舟电子科技有限公司 A kind of processing method, system and the storage medium of thermal imaging fusion visible images
CN110021029A (en) * 2019-03-22 2019-07-16 南京华捷艾米软件科技有限公司 A kind of real-time dynamic registration method and storage medium suitable for RGBD-SLAM
WO2020259248A1 (en) * 2019-06-28 2020-12-30 Oppo广东移动通信有限公司 Depth information-based pose determination method and device, medium, and electronic apparatus
CN112560648A (en) * 2020-12-09 2021-03-26 长安大学 SLAM method based on RGB-D image
CN112767291A (en) * 2021-01-04 2021-05-07 浙江大华技术股份有限公司 Visible light image and infrared image fusion method and device and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BIAN XIANZHANG et al.: "Augmented Reality Image Registration Technology Based on Semantic Segmentation", Electronic Technology & Software Engineering *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116189229A (en) * 2022-11-30 2023-05-30 中信重工开诚智能装备有限公司 Personnel tracking method based on coal mine auxiliary transportation robot
CN116189229B (en) * 2022-11-30 2024-04-05 中信重工开诚智能装备有限公司 Personnel tracking method based on coal mine auxiliary transportation robot

Also Published As

Publication number Publication date
CN113784026B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
US20190065885A1 (en) Object detection method and system
CN107481270B (en) Table tennis target tracking and trajectory prediction method, device, storage medium and computer equipment
CN110248048B (en) Video jitter detection method and device
CN108470356B (en) Target object rapid ranging method based on binocular vision
Zhang et al. Robust metric reconstruction from challenging video sequences
JP2008501172A (en) Image comparison method
CN111383252B (en) Multi-camera target tracking method, system, device and storage medium
CN110191287B (en) Focusing method and device, electronic equipment and computer readable storage medium
CN112287867B (en) Multi-camera human body action recognition method and device
US9947106B2 (en) Method and electronic device for object tracking in a light-field capture
CN111507340B (en) Target point cloud data extraction method based on three-dimensional point cloud data
Song et al. Matching in the dark: A dataset for matching image pairs of low-light scenes
CN110399823B (en) Subject tracking method and apparatus, electronic device, and computer-readable storage medium
CN113784026B (en) Method, apparatus, device and storage medium for calculating position information based on image
CN109784215B (en) In-vivo detection method and system based on improved optical flow method
CN110378934A (en) Subject detection method, apparatus, electronic equipment and computer readable storage medium
CN115035456A (en) Video denoising method and device, electronic equipment and readable storage medium
CN112560620B (en) Target tracking method and system based on target detection and feature fusion
CN112836682A (en) Method and device for identifying object in video, computer equipment and storage medium
CN112508998A (en) Visual target alignment method based on global motion
Halperin et al. Clear Skies Ahead: Towards Real‐Time Automatic Sky Replacement in Video
CN115439771A (en) Improved DSST infrared laser spot tracking method
CN116128919A (en) Multi-temporal image abnormal target detection method and system based on polar constraint
CN112633078B (en) Target tracking self-correction method, system, medium, equipment, terminal and application
CN112990096B (en) Identity card information recording method based on integration of OCR and face detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant