CN111311681A - Visual positioning method, device, robot and computer readable storage medium - Google Patents

Visual positioning method, device, robot and computer readable storage medium

Info

Publication number
CN111311681A
Authority
CN
China
Prior art keywords
camera
image
determining
feature point
coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010094997.7A
Other languages
Chinese (zh)
Inventor
支涛
饶向荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yunji Technology Co Ltd
Original Assignee
Beijing Yunji Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yunji Technology Co Ltd filed Critical Beijing Yunji Technology Co Ltd
Priority to CN202010094997.7A priority Critical patent/CN111311681A/en
Publication of CN111311681A publication Critical patent/CN111311681A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 - Complex mathematical operations
    • G06F 17/16 - Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Algebra (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Manipulator (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention relates to a visual positioning method, a visual positioning device, a robot and a computer readable storage medium, and belongs to the field of positioning. When the processor acquires a first image shot by the camera at the current moment, it determines the two-dimensional coordinates of at least six feature points from the first image; the triangulated coordinates of these feature points, determined from the camera position at the initial moment, are already known. The processor then obtains the rotation matrix and translation vector of the camera at the current moment relative to the initial moment from the two-dimensional coordinates, the triangulated coordinates and the camera model of the at least six feature points, and thereby determines the position of the camera at the current moment. The method introduces the triangulated coordinates of the feature points, that is, the feature points are restored to the three-dimensional environment, so more features of the feature points can be acquired. When the robot is moved, these additional features help the robot recover its position, in contrast to the two-dimensional plane positioning of the prior art.

Description

Visual positioning method, device, robot and computer readable storage medium
Technical Field
The application belongs to the field of positioning, and particularly relates to a visual positioning method, a visual positioning device, a robot and a computer readable storage medium.
Background
At present, most indoor robots use a 2D laser scheme for SLAM (Simultaneous Localization and Mapping). In the 2D laser scheme, the laser actively emits a light beam and receives the reflected beam, and positioning is achieved with the help of an odometer, so the whole system is relatively simple and has low resource requirements. However, the laser map of the 2D laser scheme is a 2D point map whose features are not distinctive, and the odometer only records the track of the wheels on the two-dimensional ground; that is, the 2D laser scheme works on a two-dimensional layer, and the feature points used for positioning carry no special marks. When the robot is manually moved, the feature points scanned before and after the move cannot be reliably associated and the alignment transformation between them cannot be calculated, so moving the robot easily causes positioning failure, and positioning is difficult to recover.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a visual positioning method, a device, a robot and a computer-readable storage medium, which alleviate the problem of positioning failure when the robot is manually moved, and are helpful for positioning recovery of the robot.
The embodiment of the application is realized as follows:
in a first aspect, an embodiment of the present application provides a visual positioning method, which is applied to a robot, where a camera is disposed in the robot, and the camera captures an image according to a preset frequency, and the method includes: determining at least six feature points from a first image acquired at the current moment, wherein the triangulated coordinate of each feature point is determined by the camera according to the position of the camera at the initial moment; and calculating a rotation matrix and a displacement vector of the camera at the current moment compared with the initial moment according to the triangulated coordinates of each feature point, the two-dimensional coordinates of the feature point in the first image and a projection model of the camera, so as to obtain the position of the camera at the current moment. In the method, the triangulated coordinates of the feature points are introduced, namely the feature points are restored to the three-dimensional environment, and more features of the feature points can be acquired. When the robot is moved, compared with two-dimensional plane positioning in the prior art, more characteristics acquired by the processor can effectively help the robot to realize positioning, the problem of positioning failure is relieved, and positioning recovery of the robot is facilitated.
With reference to the embodiment of the first aspect, in a possible implementation manner, a laser positioning device is further disposed in the robot, and when the initial moment is the start-up moment, before determining at least six feature points in the first image acquired at the current moment, the method further includes: acquiring a second image and a third image which overlap at the initial moment; respectively calculating descriptors of the feature points in the second image and the third image, and determining at least eight feature point pairs corresponding to features from the second image and the third image; determining a rotation matrix and a displacement vector of the camera at the initial moment according to an eight-point method and the two-dimensional coordinates of the at least eight feature point pairs in the images to which the feature points belong; determining the scale of the displacement vector at the initial moment according to the positioning information of the laser positioning device on one feature point pair; and determining the position of the camera at the initial moment according to the rotation matrix, the displacement vector and the scale at the initial moment.
With reference to the embodiment of the first aspect, in a possible implementation manner, after the determining the position of the camera at the initial time, before determining at least six feature points in the first image acquired from the current time, the method further includes: and determining the triangulated coordinates of the characteristic points corresponding to each characteristic point pair at the initial time according to the two-dimensional coordinates of the characteristic point pair in the image to which the characteristic point pair belongs, the position of the camera at the initial time and a least square method.
With reference to the embodiment of the first aspect, in a possible implementation manner, the determining, for each feature point pair, of the triangulated coordinates of the feature point corresponding to each feature point pair at the initial time according to the two-dimensional coordinates of the feature point pair in the images to which it belongs, the position of the camera at the initial time, and a least square method includes: determining the triangulated coordinates of the feature point corresponding to each feature point pair at the initial time according to the formula

$$
\begin{bmatrix}
u_0 P_1^{(3)} - P_1^{(1)} \\
v_0 P_1^{(3)} - P_1^{(2)} \\
u_1 P_2^{(3)} - P_2^{(1)} \\
v_1 P_2^{(3)} - P_2^{(2)}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = 0,
$$

wherein, for each feature point, the coordinates in the second image and the third image are (u0, v0) and (u1, v1) respectively, P1 = K[R1, t1], P2 = K[R2, t2], Pi^(j) denotes the j-th row of Pi, Ri represents the rotation matrix of the camera at time i, ti is the translation vector at that time, P1 equals K[I3x3, O3x1], I is an identity matrix, O is a zero matrix, and (x, y, z) is the resulting triangulated coordinate.
With reference to the embodiment of the first aspect, in a possible implementation manner, after the calculating obtains the rotation matrix and the displacement vector of the camera at the current time compared to the initial time, the method further includes: and determining the triangularized coordinates of a plurality of characteristic points in the first image at the current moment according to the position of the camera at the current moment and a least square method.
In a second aspect, an embodiment of the present application provides a visual positioning apparatus, which is applied to a robot, a camera is disposed in the robot, the camera captures an image according to a preset frequency, and the apparatus includes: the determining module is used for determining at least six feature points from the first image acquired at the current moment, and the triangularization coordinate of each feature point is determined by the camera according to the position of the camera at the initial moment; and the calculation module is used for calculating a rotation matrix and a displacement vector of the camera at the current moment compared with the initial moment according to the triangulated coordinates of each feature point, the two-dimensional coordinates of each feature point in the first image and a projection model of the camera, so as to obtain the position of the camera at the current moment.
With reference to the second aspect, in a possible implementation manner, a laser positioning device is further disposed in the robot, and the device further includes an obtaining module, where, when the initial moment is the start-up moment, the obtaining module is configured to obtain, at the initial moment, a second image and a third image that overlap; the determining module is further configured to calculate descriptors of the feature points in the second image and the third image, and determine at least eight feature point pairs corresponding to the features from the second image and the third image; determine a rotation matrix and a displacement vector of the camera at the initial moment according to an eight-point method and the two-dimensional coordinates of the at least eight feature point pairs in the images to which the feature points belong; determine the scale of the displacement vector at the initial moment according to the positioning information of the laser positioning device on one feature point pair; and determine the position of the camera at the initial moment according to the rotation matrix, the displacement vector and the scale at the initial moment.
With reference to the second aspect, in a possible implementation manner, the determining module is further configured to determine, for each feature point pair, a triangulated coordinate of the feature point corresponding to each feature point pair at the initial time according to the two-dimensional coordinate of the feature point pair in the image to which the feature point pair belongs, the position of the camera at the initial time, and a least square method.
With reference to the second aspect, in one possible implementation manner, the determining module is configured to determine, according to the formula

$$
\begin{bmatrix}
u_0 P_1^{(3)} - P_1^{(1)} \\
v_0 P_1^{(3)} - P_1^{(2)} \\
u_1 P_2^{(3)} - P_2^{(1)} \\
v_1 P_2^{(3)} - P_2^{(2)}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = 0,
$$

the triangulated coordinates of the feature point corresponding to each feature point pair at the initial moment, wherein the coordinates of each feature point in the second image and the third image are (u0, v0) and (u1, v1) respectively, P1 = K[R1, t1], P2 = K[R2, t2], Pi^(j) denotes the j-th row of Pi, Ri represents the rotation matrix of the camera at time i, ti is the translation vector at that time, P1 equals K[I3x3, O3x1], I is an identity matrix, O is a zero matrix, and (x, y, z) is the resulting triangulated coordinate.
With reference to the second aspect, in a possible implementation manner, the determining module is further configured to determine triangulated coordinates of a plurality of feature points in the first image at the current time according to a position of the camera at the current time and a least square method.
In a third aspect, an embodiment of the present application further provides a robot including: a memory and a processor, the memory and the processor connected; the memory is used for storing programs; the processor calls a program stored in the memory to perform the method of the first aspect embodiment and/or any possible implementation manner of the first aspect embodiment.
In a fourth aspect, the present application further provides a non-transitory computer-readable storage medium (hereinafter, referred to as a computer-readable storage medium), on which a computer program is stored, where the computer program is executed by a computer to perform the method in the foregoing first aspect and/or any possible implementation manner of the first aspect.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings without creative efforts. The foregoing and other objects, features and advantages of the application will be apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not intended to be to scale as practical, emphasis instead being placed upon illustrating the subject matter of the present application.
Fig. 1 shows a schematic structural diagram of a robot provided in an embodiment of the present application.
Fig. 2 shows a flowchart of a visual positioning method provided in an embodiment of the present application.
Fig. 3 shows a schematic diagram for establishing a rectangular coordinate system based on an image according to an embodiment of the present application.
Fig. 4 shows a block diagram of a visual positioning apparatus according to an embodiment of the present application.
Reference numbers: 100-a robot; 110-a processor; 120-a memory; 130-a camera; 140-laser positioning means; 400-a visual positioning device; 410-a determination module; 420-calculation module.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, relational terms such as "first," "second," and the like may be used solely in the description herein to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Further, the term "and/or" in the present application is only one kind of association relationship describing the associated object, and means that three kinds of relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone.
The embodiment of the application provides a visual positioning method, a visual positioning device, a robot and a computer readable storage medium, which are used for relieving the problem that the robot is ineffective in positioning when being manually moved and are beneficial to the recovery of the positioning of the robot. The technology can be realized by adopting corresponding software, hardware and a combination of software and hardware. The following describes embodiments of the present application in detail.
First, a robot 100 for implementing the visual positioning method and apparatus according to the embodiment of the present application is described with reference to fig. 1.
Optionally, the robot 100 may include: processor 110, memory 120, camera 130, laser positioning device 140, and the like.
It should be noted that the components and configuration of the robot 100 shown in fig. 1 are exemplary only, and not limiting, and that the robot 100 may have other components and configurations as desired.
The processor 110, memory 120, camera 130, laser positioning device 140, and other components that may be present in the robot 100 are electrically connected to each other, directly or indirectly, to enable the transmission or interaction of data. For example, the processor 110, the memory 120, the camera 130, the laser positioning device 140, and other components that may be present may be electrically connected to each other via one or more communication buses or signal lines.
The memory 120 is used for storing a program, for example, a program corresponding to a visual positioning method appearing later or a visual positioning device appearing later. Optionally, when the visual positioning apparatus is stored in the memory 120, the visual positioning apparatus includes at least one software function module that can be stored in the memory 120 in the form of software or firmware (firmware).
Alternatively, the software function module included in the visual positioning apparatus may also be solidified in an Operating System (OS) of the robot 100.
The processor 110 is adapted to execute executable modules stored in the memory 120, such as software functional modules or computer programs comprised by the visual positioning apparatus. When the processor 110 receives the execution instruction, it may execute the computer program, for example, to perform: determining at least six feature points from a first image acquired at the current moment, wherein the triangulated coordinate of each feature point is determined by the camera according to the position of the camera at the initial moment; and calculating a rotation matrix and a displacement vector of the camera at the current moment compared with the initial moment according to the triangulated coordinates of each feature point, the two-dimensional coordinates of the feature point in the first image and a projection model of the camera, so as to obtain the position of the camera at the current moment.
Of course, the method disclosed in any of the embodiments of the present application can be applied to the processor 110, or implemented by the processor 110.
The visual positioning method provided by the present application will be described with reference to fig. 2.
Step S110: at least six feature points are determined from the first image acquired at the current moment, and the triangulated coordinates of each feature point are determined by the camera according to the position of the camera at the initial moment.
Step S120: and calculating a rotation matrix and a displacement vector of the camera at the current moment compared with the initial moment according to the triangulated coordinates of each feature point, the two-dimensional coordinates of the feature point in the first image and a projection model of the camera, so as to obtain the position of the camera at the current moment.
In the embodiment of the application, the position of the camera at a certain moment is determined, namely, the rotation matrix (R) and the displacement vector (t) of the camera at the moment compared with the initial moment are determined.
In order to determine the position of the camera, in the embodiment of the application, the camera is arranged in the robot, and during the operation of the robot, the camera continuously takes images according to a preset frequency, and then the taken images are sent to the processor for processing, so that the processor can determine the position of the camera based on the images.
When the processor acquires the image at the current time T1 (the image corresponding to the current time T1 is referred to as a first image Q1), the processor starts to determine the position S1 where the camera is located at the current time T1 from Q1, and the determination is performed as follows.
(1) At least six feature points and the two-dimensional coordinates of each feature point in Q1 are obtained from Q1.
After the processor acquires the image, a rectangular coordinate system can be established on the basis of the acquired image, so that two-dimensional coordinates of certain feature points in the image are determined. For example, when the processor acquires the first image Q1 captured by the camera at the current time T1, a rectangular coordinate system may be established with the midpoint of Q1 as the origin of coordinates, the direction parallel to the horizon as the direction of the X axis, and the direction perpendicular to the horizon as the direction of the Y axis, as shown in fig. 3. After establishing the rectangular coordinate system, the processor may determine the two-dimensional coordinates of the feature points in Q1.
(2) Triangulated coordinates of the at least six feature points at the camera position S0 determined at the initial time T0 are obtained.
It is to be noted that, at an initial time T0 before the processor determines the position where the camera is located at the current time T1 from Q1, the processor also acquires an image corresponding to the initial time T0 (the image corresponding to the initial time T0 is referred to as an initial image Q0). It will be appreciated that there are a number of common characteristic points in Q1 and Q0. Wherein at least six characteristic points in step (1) occur in both Q0 and Q1.
Further, the processor has determined the position S0 of the camera at the initial time T0 (i.e., R and T at the initial time T0) from Q0 at the initial time T0, and thus, the processor may triangulate the feature points in Q0 based on R and T at the initial time T0, resulting in triangulated coordinates for each feature point in Q0.
Note that, before the initial time T0, the camera has already captured a number of images, several of which overlap with Q0. Assume that the image acquired by the processor at a time T3 before the initial time T0 is the third image Q3. For a feature point A in Q0 whose triangulated coordinates are to be calculated, let the two-dimensional coordinates of A in Q0 be (u0, v0) and the two-dimensional coordinates of the corresponding point in Q3 at T3 be (u1, v1). Let P1 = K[R1, t1] and P2 = K[R2, t2], where Ri represents the rotation matrix of the camera at time i, ti is the translation vector at that time, P1 equals K[I3x3, O3x1], I is an identity matrix, O is a zero matrix, and K is the intrinsic matrix of the camera (which can be obtained in advance from the camera parameters). With these parameters, the processor determines the triangulated coordinates (x, y, z) of the feature point A in Q0 at the initial time T0 from the formula

$$
\begin{bmatrix}
u_0 P_1^{(3)} - P_1^{(1)} \\
v_0 P_1^{(3)} - P_1^{(2)} \\
u_1 P_2^{(3)} - P_2^{(1)} \\
v_1 P_2^{(3)} - P_2^{(2)}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = 0,
$$

where Pi^(j) denotes the j-th row of Pi.
The calculation process is as follows:
The system is solved by least squares using SVD. The coefficient matrix above (built from u0, v0, u1, v1, P1 and P2) is decomposed as UΣV^T, where U, Σ and V are matrices. Suppose the last column of V is [p, q, r, w]; then [x, y, z] = [p/w, q/w, r/w]. Since solving such a system by the least square method is prior art, it is not described here in detail.
Through the above calculation, the triangulated coordinates of the feature point A at the camera position S0 determined at the initial time T0 can be obtained. Similarly, the processor may calculate the triangulated coordinates of a plurality of feature points in Q0 at the camera position S0, for subsequent use in determining the camera position S1 at the current time T1.
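As a minimal sketch of this least-squares step, assuming numpy and projection matrices P1 and P2 constructed as K[R1, t1] and K[R2, t2] (the function and variable names are illustrative only):

```python
import numpy as np

def triangulate(P1, P2, uv0, uv1):
    """Recover (x, y, z) from two observations (u0, v0) and (u1, v1)."""
    u0, v0 = uv0
    u1, v1 = uv1
    # Stack the four linear constraints of the triangulation system above.
    A = np.array([
        u0 * P1[2] - P1[0],
        v0 * P1[2] - P1[1],
        u1 * P2[2] - P2[0],
        v1 * P2[2] - P2[1],
    ])
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    p, q, r, w = Vt[-1]
    return np.array([p / w, q / w, r / w])
```

Calling triangulate(P1, P2, (u0, v0), (u1, v1)) returns the (x, y, z) used as the triangulated coordinates of the feature point.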
(3) The two-dimensional coordinates and the triangulated coordinates of each of the at least six feature points are substituted into the projection model of the camera

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \left( R \begin{bmatrix} x \\ y \\ z \end{bmatrix} + t \right),
$$

which yields at least six sets of equations.

(4) Solving these six sets of equations simultaneously gives R and t at T1, and thereby determines S1.

Here s is a scale factor, K is the camera intrinsic matrix (which can be obtained in advance from the camera parameters), R is the rotation matrix (a 3 x 3 matrix) of the camera from the space at the initial moment to the space at the current moment, t is the displacement vector (a 3 x 1 matrix) of the camera from the space at the initial moment to the space at the current moment, (u, v) are the two-dimensional coordinates of the feature point in Q1, and (x, y, z) are the triangulated coordinates of the feature point under S0.
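Step (4) can be sketched as follows, assuming OpenCV's PnP solver as one possible way of solving the six or more equations simultaneously; the method above only requires that the recovered R and t satisfy the projection model, and the names below are illustrative:

```python
import cv2
import numpy as np

def estimate_pose(object_points, image_points, K):
    """object_points: Nx3 triangulated coordinates (N >= 6).
    image_points: Nx2 two-dimensional coordinates in the first image Q1.
    K: 3x3 camera intrinsic matrix."""
    ok, rvec, tvec = cv2.solvePnP(object_points.astype(np.float64),
                                  image_points.astype(np.float64),
                                  K, None)
    if not ok:
        raise RuntimeError("pose estimation failed")
    R, _ = cv2.Rodrigues(rvec)  # convert rotation vector to a 3x3 rotation matrix
    return R, tvec
```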
Optionally, after determining S1, the processor may also determine triangulated coordinates of a plurality of feature points in Q1 according to the process in step (2), so that when the processor acquires an image at the next time, the processor determines the position of the camera at the next time.
In the above-described process of determining the position S1 at which the camera is located at the current time T1, the position S1 is acquired based on the position S0 of the camera at the initial time T0. The initial moment is the robot start-up moment, at which time the processor determines its camera position at the initial moment as follows.
In the system starting stage, two images are obtained, feature point pairs corresponding to at least eight features are determined from the two images, and two-dimensional coordinates of each feature point pair in the image to which the feature point pair belongs are obtained.
A number of feature points are common to the two images. When a feature point A appears in both images, its two appearances are called a feature point pair. Assume that the two-dimensional coordinates of A are (x1, y1, 1) in one image and (x1', y1', 1) in the other image.

Similarly, assume that the feature points B, C, D, E, F, G and H have two-dimensional coordinates (x2, y2, 1) through (x8, y8, 1) in one image and (x2', y2', 1) through (x8', y8', 1) in the other image. Substituting the two-dimensional coordinates of the eight feature point pairs into the fundamental matrix equation p2^T F p1 = 0 gives:

$$
\begin{bmatrix}
x_1 x_1' & y_1 x_1' & x_1' & x_1 y_1' & y_1 y_1' & y_1' & x_1 & y_1 & 1 \\
\vdots & & & & & & & & \vdots \\
x_8 x_8' & y_8 x_8' & x_8' & x_8 y_8' & y_8 y_8' & y_8' & x_8 & y_8 & 1
\end{bmatrix}
\begin{bmatrix} f_{11} \\ f_{12} \\ f_{13} \\ f_{21} \\ f_{22} \\ f_{23} \\ f_{31} \\ f_{32} \\ f_{33} \end{bmatrix} = 0
$$
The fundamental matrix F can then be calculated by the eight-point method, where

$$
F = \begin{bmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{bmatrix}.
$$
The above coefficient matrix is decomposed by least-squares SVD as UDV^T, and the last column of V is reshaped into a 3 x 3 matrix F'. An SVD is then applied to F'; since the rank of the fundamental matrix should be 2, F' = U Diag(r, s, t) V^T is replaced by F = U Diag(r, s, 0) V^T.
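A minimal sketch of this eight-point estimate with the rank-2 enforcement, assuming numpy and pixel coordinates used directly as above (names are illustrative; in practice a normalization step, not mentioned here, improves numerical stability):

```python
import numpy as np

def eight_point(pts1, pts2):
    """pts1, pts2: arrays of shape (8, 2) with matched points (x, y) and (x', y')."""
    rows = []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        rows.append([x * xp, y * xp, xp, x * yp, y * yp, yp, x, y, 1.0])
    A = np.array(rows)
    # Least-squares solution of A f = 0: last right singular vector.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value of F.
    U, S, Vt2 = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt2
```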
In addition, since the shooting position of the latter image differs from that of the former image, F = K^{-T} [t]x R K^{-1} can be obtained, where K is the camera intrinsic matrix and the essential matrix E = [t]x R, with [t]x the skew-symmetric (cross-product) matrix of t, t the translation vector of the shooting position of the latter image compared with the former, and R the corresponding rotation matrix. Suppose E = UΣV^T, let t1 = U(:, 2) and t2 = -t1, and let

$$
R_1 = U W V^{T}, \qquad R_2 = U W^{T} V^{T}, \qquad
W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
$$
This yields four candidate combinations (R1, t1), (R1, t2), (R2, t1), (R2, t2). Since a reconstructed point must lie in front of both cameras, the coordinates of a feature point are substituted into each candidate, and the group of R and t for which the resulting depths are both positive is determined as the initial position.
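The selection among the four candidates can be sketched as follows, assuming the essential matrix E and intrinsic matrix K are available and reusing the illustrative triangulate() function sketched earlier:

```python
import numpy as np

def decompose_essential(E, K, uv0, uv1):
    """Pick the (R, t) candidate that gives positive depth for a test point pair."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    if np.linalg.det(R1) < 0:       # keep proper rotations (det = +1)
        R1 = -R1
    if np.linalg.det(R2) < 0:
        R2 = -R2
    t1 = U[:, 2]                     # translation direction, up to sign and scale
    t2 = -t1
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    for R, t in [(R1, t1), (R1, t2), (R2, t1), (R2, t2)]:
        P2 = K @ np.hstack([R, t.reshape(3, 1)])
        X = triangulate(P1, P2, uv0, uv1)   # point in the first camera frame
        X2 = R @ X + t                      # same point in the second camera frame
        if X[2] > 0 and X2[2] > 0:          # positive depth in both views
            return R, t
    raise RuntimeError("no candidate passed the positive-depth check")
```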
It should be noted that, since the relative spatial relationship constructed by one camera can be arbitrarily scaled and still satisfy the camera projection relationship, the scale of t obtained by the above method is uncertain. In order to determine the scale relationship, in the robot provided in the embodiment of the present application, a laser positioning device is further provided to assist the camera in determining the scale of the translation vector t.
Alternatively, the laser positioning device may determine positioning information for one feature point pair in the two images; assume the positioning information is M1 and M2 respectively. The displacement of the feature point over the period in which the camera captured the two images can then be calculated as d = |M2 - M1|, from which the scaled translation is obtained as

$$
t' = \frac{d}{\lVert t \rVert}\, t.
$$
Therefore, the rotation matrix R and the translation vector t' of the camera at the initial time are finally determined. At the next subsequent time, R and t of the camera at the next time are determined based on R and t'.
It should be noted that the positioning of the position by the robot through the laser positioning device is prior art and will not be described herein.
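A brief sketch of the scale fix, assuming it takes the form given above (t rescaled so that its length equals the laser-measured displacement d):

```python
import numpy as np

def apply_laser_scale(t, d):
    """Rescale the unit-scale translation t using the laser displacement d = |M2 - M1|."""
    return t * (d / np.linalg.norm(t))
```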
In addition, in the embodiment of the present application, whether two images contain feature points corresponding to the same feature is determined by calculating the ORB (Oriented FAST and Rotated BRIEF) descriptor of each key point in each image. The computation of the ORB descriptor is described below.
For an image, the pixel value of a pixel point can be examined together with a neighborhood formed by 16 adjacent pixel points centered on it. When the gray values of 9 of these 16 pixels are all larger, or all smaller, than the gray value of the center pixel, the center pixel is determined to be a key point.
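As a sketch, OpenCV's FAST detector implements this kind of 16-pixel comparison test; the threshold value and file name below are illustrative assumptions:

```python
import cv2

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
fast = cv2.FastFeatureDetector_create(threshold=20)
keypoints = fast.detect(gray, None)
print(len(keypoints), "FAST key points detected")
```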
In order to ensure that the descriptor obtained by subsequent calculation has rotation invariance, the centroid C and the rotation direction theta of the key point can be calculated by the following formula:
$$
m_{pq} = \sum_{x, y} x^{p} y^{q} I(x, y), \qquad
C = \left( \frac{m_{10}}{m_{00}}, \; \frac{m_{01}}{m_{00}} \right)
$$

$$
\theta = \arctan\!\left( \frac{m_{01}}{m_{10}} \right)
$$
where I(x, y) represents the gray value of the pixel at (x, y) in the neighborhood of the key point.
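A minimal numpy sketch of this intensity-centroid orientation for a square patch around a key point (patch extraction itself is assumed):

```python
import numpy as np

def patch_orientation(patch):
    """patch: 2D array of gray values centered on the key point."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0          # coordinates relative to the key point
    ys = ys - (h - 1) / 2.0
    m10 = np.sum(xs * patch)
    m01 = np.sum(ys * patch)
    return np.arctan2(m01, m10)      # rotation direction theta
```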
For each key point, after its centroid and rotation direction are determined, the BRIEF descriptor of the key point (a binary string of 0s and 1s) is calculated along that direction, which yields a BRIEF descriptor with rotation invariance, namely the ORB descriptor. Computing the BRIEF descriptor of a key point is prior art and is not described here again.
After the ORB descriptors of each key point in each image are obtained, for two different images, each ORB descriptor included in the two different images can be compared, and when the similarity of the two ORB descriptors in the two images reaches a threshold, the feature points corresponding to the two descriptors are determined to be feature point pairs corresponding to features.
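As a sketch of this comparison, assuming OpenCV's ORB implementation and brute-force Hamming matching (file names and parameter values are illustrative):

```python
import cv2

img1 = cv2.imread("second.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
img2 = cv2.imread("third.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Keep the best matches as the feature point pairs used by the eight-point step.
pairs = [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches[:50]]
```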
According to the visual positioning method provided by the embodiment of the application, when the processor acquires the first image shot by the camera at the current moment, it determines the two-dimensional coordinates of at least six feature points from the first image; the triangulated coordinates of these feature points, determined from the camera position at the initial moment, are already known. The processor then obtains the rotation matrix and translation vector of the camera at the current moment relative to the initial moment from the two-dimensional coordinates, the triangulated coordinates and the camera model of the at least six feature points, and thereby determines the position of the camera at the current moment. The method introduces the triangulated coordinates of the feature points, that is, the feature points are restored to the three-dimensional environment, so more features of the feature points can be acquired. When the robot is moved, these additional features help the processor effectively restore the robot's position, in contrast to the two-dimensional plane positioning of the prior art; the problem of positioning failure is alleviated, and positioning recovery of the robot is facilitated.
As shown in fig. 4, an embodiment of the present application further provides a visual positioning apparatus 400, where the visual positioning apparatus 400 may include: a determination module 410 and a calculation module 420.
A determining module 410, configured to determine at least six feature points from a first image acquired at a current time, where a triangulated coordinate of each feature point is determined by the camera according to a position of the camera at an initial time;
a calculating module 420, configured to calculate, according to the triangulated coordinates of each feature point, the two-dimensional coordinates of each feature point in the first image, and a projection model of the camera, a rotation matrix and a displacement vector of the camera at the current time, which are compared with the initial time, and obtain a position of the camera at the current time.
Optionally, in a possible implementation manner, a laser positioning device is further disposed in the robot, and the device further includes an obtaining module, where, when the initial moment is the start-up moment, the obtaining module is configured to obtain, at the initial moment, a second image and a third image that overlap; the determining module 410 is further configured to calculate descriptors of feature points in the second image and the third image, respectively, and determine at least eight feature point pairs corresponding to features from the second image and the third image; determine a rotation matrix and a displacement vector of the camera at the initial moment according to an eight-point method and the two-dimensional coordinates of the at least eight feature point pairs in the images to which the feature points belong; determine the scale of the displacement vector at the initial moment according to the positioning information of the laser positioning device on one feature point pair; and determine the position of the camera at the initial moment according to the rotation matrix, the displacement vector and the scale at the initial moment.
Optionally, in a possible implementation manner, the determining module 410 is further configured to determine, for each feature point pair, a triangulated coordinate of the feature point corresponding to each feature point pair at the initial time according to the two-dimensional coordinate of the feature point pair in the image to which the feature point pair belongs, the position of the camera at the initial time, and a least square method.
Optionally, in a possible implementation, the determining module 410 is configured to determine, according to the formula

$$
\begin{bmatrix}
u_0 P_1^{(3)} - P_1^{(1)} \\
v_0 P_1^{(3)} - P_1^{(2)} \\
u_1 P_2^{(3)} - P_2^{(1)} \\
v_1 P_2^{(3)} - P_2^{(2)}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = 0,
$$

the triangulated coordinates of the feature point corresponding to each feature point pair at the initial moment, wherein the coordinates of each feature point in the second image and the third image are (u0, v0) and (u1, v1) respectively, P1 = K[R1, t1], P2 = K[R2, t2], Pi^(j) denotes the j-th row of Pi, Ri represents the rotation matrix of the camera at time i, ti is the translation vector at that time, P1 equals K[I3x3, O3x1], I is an identity matrix, O is a zero matrix, and (x, y, z) is the resulting triangulated coordinate.
Optionally, in a possible implementation manner, the determining module 410 is further configured to determine triangulated coordinates of a plurality of feature points in the first image at the current time according to a position of the camera at the current time and a least square method.
The visual positioning apparatus 400 provided in the embodiment of the present application has the same implementation principle and the same technical effects as those of the foregoing method embodiments, and for the sake of brief description, reference may be made to the corresponding contents in the foregoing method embodiments for the parts of the apparatus embodiments that are not mentioned.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a computer, the steps included in the above-mentioned visual positioning method are performed.
In summary, according to the visual positioning method, the visual positioning apparatus, the robot, and the computer-readable storage medium provided by the embodiments of the present invention, when the processor acquires the first image captured by the camera at the current moment, it determines the two-dimensional coordinates of at least six feature points from the first image; the triangulated coordinates of these feature points, determined from the camera position at the initial moment, are already known. The processor then obtains the rotation matrix and translation vector of the camera at the current moment relative to the initial moment from the two-dimensional coordinates, the triangulated coordinates and the camera model of the at least six feature points, and thereby determines the position of the camera at the current moment. The method introduces the triangulated coordinates of the feature points, that is, the feature points are restored to the three-dimensional environment, so more features of the feature points can be acquired. When the robot is moved, these additional features help the processor effectively restore the robot's position, in contrast to the two-dimensional plane positioning of the prior art; the problem of positioning failure is alleviated, and positioning recovery of the robot is facilitated.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a notebook computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application.

Claims (10)

1. A visual positioning method is applied to a robot, a camera is arranged in the robot, and the camera shoots images according to a preset frequency, and the method comprises the following steps:
determining at least six feature points from a first image acquired at the current moment, wherein the triangulated coordinate of each feature point is determined by the camera according to the position of the camera at the initial moment;
and calculating a rotation matrix and a displacement vector of the camera at the current moment compared with the initial moment according to the triangulated coordinates of each feature point, the two-dimensional coordinates of the feature point in the first image and a projection model of the camera, so as to obtain the position of the camera at the current moment.
2. The method of claim 1, wherein a laser positioning device is further disposed within the robot, and wherein before the at least six feature points are determined in the first image acquired from the current time, the method further comprises:
acquiring a second image and a third image which are overlapped at the initial moment;
respectively calculating descriptors of the feature points in the second image and the third image, and determining at least eight feature point pairs corresponding to features from the second image and the third image;
determining a rotation matrix and a displacement vector of the camera at the initial moment according to an eight-point method and two-dimensional coordinates of the at least eight characteristic point pairs in the images to which the characteristic points belong;
determining the scale of the displacement vector at the initial moment according to the positioning information of the laser positioning device on one characteristic point pair;
and determining the position of the camera at the initial moment according to the rotation matrix, the displacement vector and the scale at the initial moment.
3. The method of claim 2, wherein after the determining the position of the camera at the initial time, prior to determining at least six feature points in the first image acquired from the current time, the method further comprises:
and determining the triangulated coordinates of the characteristic points corresponding to each characteristic point pair at the initial time according to the two-dimensional coordinates of the characteristic point pair in the image to which the characteristic point pair belongs, the position of the camera at the initial time and a least square method.
4. The method according to claim 3, wherein the determining, for each feature point pair, triangulated coordinates of the feature point corresponding to each feature point pair at the initial time according to the two-dimensional coordinates of the feature point pair in the image to which the feature point pair belongs, the position of the camera at the initial time, and a least square method comprises:
determining, according to the formula

$$
\begin{bmatrix}
u_0 P_1^{(3)} - P_1^{(1)} \\
v_0 P_1^{(3)} - P_1^{(2)} \\
u_1 P_2^{(3)} - P_2^{(1)} \\
v_1 P_2^{(3)} - P_2^{(2)}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = 0,
$$

the triangulated coordinates of the feature point corresponding to each feature point pair at the initial time, wherein the coordinates of each feature point in the second image and the third image are (u0, v0) and (u1, v1) respectively, P1 = K[R1, t1], P2 = K[R2, t2], Pi^(j) denotes the j-th row of Pi, Ri represents the rotation matrix of the camera at time i, ti is the translation vector at that time, P1 equals K[I3x3, O3x1], I is an identity matrix, O is a zero matrix, and (x, y, z) is the resulting triangulated coordinate.
5. The method of claim 1, wherein after the calculating obtains a rotation matrix and a displacement vector of the camera at a current time compared to the initial time, the method further comprises:
and determining the triangularized coordinates of a plurality of characteristic points in the first image at the current moment according to the position of the camera at the current moment and a least square method.
6. A visual positioning device, characterized in that, is applied to the robot, be provided with the camera in the robot, the camera shoots the image according to preset frequency, the device includes:
the determining module is used for determining at least six feature points from the first image acquired at the current moment, and the triangularization coordinate of each feature point is determined by the camera according to the position of the camera at the initial moment;
and the calculation module is used for calculating a rotation matrix and a displacement vector of the camera at the current moment compared with the initial moment according to the triangulated coordinates of each feature point, the two-dimensional coordinates of each feature point in the first image and a projection model of the camera, so as to obtain the position of the camera at the current moment.
7. The apparatus of claim 6, further comprising a laser positioning device disposed within the robot, the apparatus further comprising an acquisition module,
the acquisition module is used for acquiring a second image and a third image which are overlapped at the initial moment;
the determining module is further configured to calculate descriptors of the feature points in the second image and the third image, and determine at least eight feature point pairs corresponding to the features from the second image and the third image; determining a rotation matrix and a displacement vector of the camera at the initial moment according to an eight-point method and two-dimensional coordinates of the at least eight characteristic point pairs in the images to which the characteristic points belong; determining the scale of the displacement vector at the initial moment according to the positioning information of the laser positioning device on one characteristic point pair; and determining the position of the camera at the initial moment according to the rotation matrix, the displacement vector and the scale at the initial moment.
8. The apparatus of claim 6, wherein the determining module is further configured to determine triangulated coordinates of a plurality of feature points in the first image at the current time according to a position of the camera at the current time and a least square method.
9. A robot, comprising: a memory and a processor, the memory and the processor connected;
the memory is used for storing programs;
the processor calls a program stored in the memory to perform the method of any of claims 1-5.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a computer, performs the method of any one of claims 1-5.
CN202010094997.7A 2020-02-14 2020-02-14 Visual positioning method, device, robot and computer readable storage medium Pending CN111311681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010094997.7A CN111311681A (en) 2020-02-14 2020-02-14 Visual positioning method, device, robot and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010094997.7A CN111311681A (en) 2020-02-14 2020-02-14 Visual positioning method, device, robot and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN111311681A true CN111311681A (en) 2020-06-19

Family

ID=71145037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010094997.7A Pending CN111311681A (en) 2020-02-14 2020-02-14 Visual positioning method, device, robot and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111311681A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113375662A (en) * 2021-01-28 2021-09-10 深圳市瑞立视多媒体科技有限公司 Rigid body posture determination method and device of double-light-ball interactive pen and computer equipment
CN114117113A (en) * 2022-01-28 2022-03-01 杭州宏景智驾科技有限公司 Multi-feature-point motor vehicle positioning method and device, electronic equipment and storage medium
CN115086541A (en) * 2021-03-15 2022-09-20 北京字跳网络技术有限公司 Shooting position determining method, device, equipment and medium
CN115086538A (en) * 2021-03-15 2022-09-20 北京字跳网络技术有限公司 Shooting position determining method, device, equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011238048A (en) * 2010-05-11 2011-11-24 Nippon Telegr & Teleph Corp <Ntt> Position attitude measurement device and position attitude measurement program
CN109993793A (en) * 2019-03-29 2019-07-09 北京易达图灵科技有限公司 Vision positioning method and device
CN110728715A (en) * 2019-09-06 2020-01-24 南京工程学院 Camera angle self-adaptive adjusting method of intelligent inspection robot

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011238048A (en) * 2010-05-11 2011-11-24 Nippon Telegr & Teleph Corp <Ntt> Position attitude measurement device and position attitude measurement program
CN109993793A (en) * 2019-03-29 2019-07-09 北京易达图灵科技有限公司 Vision positioning method and device
CN110728715A (en) * 2019-09-06 2020-01-24 南京工程学院 Camera angle self-adaptive adjusting method of intelligent inspection robot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
徐聪 (Xu Cong): "Research on Key Technologies of Indoor Positioning Based on Computer Vision" *
梁利嘉 (Liang Lijia): "Research on Robot Positioning Technology Based on Monocular Vision" *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113375662A (en) * 2021-01-28 2021-09-10 深圳市瑞立视多媒体科技有限公司 Rigid body posture determination method and device of double-light-ball interactive pen and computer equipment
CN115086541A (en) * 2021-03-15 2022-09-20 北京字跳网络技术有限公司 Shooting position determining method, device, equipment and medium
CN115086538A (en) * 2021-03-15 2022-09-20 北京字跳网络技术有限公司 Shooting position determining method, device, equipment and medium
WO2022194145A1 (en) * 2021-03-15 2022-09-22 北京字跳网络技术有限公司 Photographing position determination method and apparatus, device, and medium
CN115086541B (en) * 2021-03-15 2023-12-22 北京字跳网络技术有限公司 Shooting position determining method, device, equipment and medium
CN115086538B (en) * 2021-03-15 2024-03-15 北京字跳网络技术有限公司 Shooting position determining method, device, equipment and medium
CN114117113A (en) * 2022-01-28 2022-03-01 杭州宏景智驾科技有限公司 Multi-feature-point motor vehicle positioning method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
EP3871190B1 (en) Camera pose estimation using obfuscated features
CN111311681A (en) Visual positioning method, device, robot and computer readable storage medium
WO2020206903A1 (en) Image matching method and device, and computer readable storage medium
JP6364049B2 (en) Vehicle contour detection method, device, storage medium and computer program based on point cloud data
Huang et al. A coarse-to-fine algorithm for matching and registration in 3D cross-source point clouds
US20190147221A1 (en) Pose estimation and model retrieval for objects in images
Zhuang et al. 3-D-laser-based scene measurement and place recognition for mobile robots in dynamic indoor environments
CN112461230A (en) Robot repositioning method and device, robot and readable storage medium
CN107329962B (en) Image retrieval database generation method, and method and device for enhancing reality
CN109472820B (en) Monocular RGB-D camera real-time face reconstruction method and device
EP3274964B1 (en) Automatic connection of images using visual features
CN111459269B (en) Augmented reality display method, system and computer readable storage medium
CN112734837B (en) Image matching method and device, electronic equipment and vehicle
WO2023087758A1 (en) Positioning method, positioning apparatus, computer-readable storage medium, and computer program product
CN111582204A (en) Attitude detection method and apparatus, computer device and storage medium
CN114677588A (en) Obstacle detection method, obstacle detection device, robot and storage medium
CN113052907A (en) Positioning method of mobile robot in dynamic environment
WO2021142843A1 (en) Image scanning method and device, apparatus, and storage medium
KR20230049969A (en) Method and apparatus for global localization
CN116481517B (en) Extended mapping method, device, computer equipment and storage medium
KR100526018B1 (en) Method for recognizing and tracking an object
CN111402429A (en) Scale reduction and three-dimensional reconstruction method, system, storage medium and equipment
JP3476710B2 (en) Euclidean 3D information restoration method and 3D information restoration apparatus
Paya et al. Estimating the position and orientation of a mobile robot with respect to a trajectory using omnidirectional imaging and global appearance
CN115294280A (en) Three-dimensional reconstruction method, apparatus, device, storage medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: Room 201, building 4, courtyard 8, Dongbeiwang West Road, Haidian District, Beijing

Applicant after: Beijing Yunji Technology Co.,Ltd.

Address before: Room 201, building 4, courtyard 8, Dongbeiwang West Road, Haidian District, Beijing

Applicant before: BEIJING YUNJI TECHNOLOGY Co.,Ltd.