CN113011231A - Classified sliding window method, SLAM positioning method, system and electronic equipment thereof - Google Patents


Info

Publication number
CN113011231A
Authority
CN
China
Prior art keywords
window
frame
pose
observation
feature point
Prior art date
Legal status
Granted
Application number
CN201911326341.7A
Other languages
Chinese (zh)
Other versions
CN113011231B (en)
Inventor
兰国清
周俊
黄菊
胡增新
Current Assignee
Sunny Optical Zhejiang Research Institute Co Ltd
Original Assignee
Sunny Optical Zhejiang Research Institute Co Ltd
Priority date
Filing date
Publication date
Application filed by Sunny Optical Zhejiang Research Institute Co Ltd
Priority to CN201911326341.7A
Publication of CN113011231A
Application granted
Publication of CN113011231B
Status: Active

Classifications

    • G06V 20/10 Terrestrial scenes (Scenes; scene-specific elements)
    • G01C 11/04 Interpretation of pictures (Photogrammetry or videogrammetry, e.g. stereogrammetry; photographic surveying)
    • G01C 21/165 Dead reckoning by integrating acceleration or speed, i.e. inertial navigation, combined with non-inertial navigation instruments
    • G01C 21/20 Instruments for performing navigational calculations
    • G06F 18/24 Classification techniques (Pattern recognition)
    • G06V 10/462 Salient features, e.g. scale-invariant feature transform [SIFT] (Extraction of image or video features)


Abstract

A classification sliding window method, a SLAM positioning method, and a corresponding system and electronic device are provided. The classification sliding window method comprises the following steps: S110: determining whether the number of observation frames in a window has reached the maximum window number of the window; S120: when the number of observation frames has reached the maximum window number, removing a preset number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and S130: when the number of observation frames is smaller than the maximum window number, further determining whether the number of observation frames is greater than a preset frame number threshold; if so, selectively removing observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.

Description

Classified sliding window method, SLAM positioning method, system and electronic equipment thereof
Technical Field
The invention relates to the technical field of SLAM, and in particular to a classification sliding window method, a SLAM positioning method, and a corresponding system and electronic device.
Background
With the continuous improvement of computing, intelligent, and sensor technologies, intelligent applications such as AR/VR, unmanned aerial vehicles, and intelligent robots are entering the market, and Simultaneous Localization and Mapping (SLAM) technology is accordingly receiving increasing attention. In general, the SLAM problem can be divided into two parts: the front end mainly processes the data acquired by the sensors and converts them into relative poses or other forms a machine can understand; the back end mainly deals with optimal posterior estimation, i.e., optimal estimation of poses, maps, and the like. In current SLAM positioning technology, pose estimation is generally performed by visual-inertial odometry (VIO); this scheme offers high positioning accuracy and stable performance, which is why it is so widely applied.
At present, many open-source visual-inertial odometry algorithms exist, and they can be divided into filtering optimization methods and nonlinear optimization methods according to the back-end optimization approach. Because the state vector dimension and the covariance matrix in a filtering optimization method are relatively small, such a method has a small computational load and high speed, and can therefore maintain positioning under fast motion; a typical open-source filtering optimization algorithm is S-MSCKF, but its positioning accuracy is low and its robustness is poor. A nonlinear optimization method, by contrast, needs to maintain a global map and global keyframes, so it has a large computational load and poor real-time performance; the classic representative of nonlinear optimization methods is VINS-Mono, which operates well in most scenes but demands considerable CPU resources and has poor real-time performance.
In addition, a filtering optimization method such as S-MSCKF usually saves each observation frame input to the back end in a window, and deletes observation frames from the window once the number of observation frames in the window reaches the maximum storable number (i.e., the window is full). Specifically, when the window is full, the relative pose between the newest frame and the second-newest frame is calculated first; if this relative pose meets a certain threshold, the second-newest frame is removed; if not, the oldest first frame is removed. The same test is then applied once more to the updated newest and second-newest frames; if the threshold is met, the second-newest frame is removed; if not, the oldest second frame is removed. Finally, the feature points on the two removed observation frames are input to a filter for filtering optimization, thereby obtaining the positioning information.
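To make the strategy just described concrete, the following Python sketch paraphrases it under stated assumptions: frames are held oldest-first in a list, each frame carries a 4x4 camera-to-world pose matrix under the hypothetical key "T_w_c", and both thresholds are placeholder values rather than values used by S-MSCKF.

import numpy as np

# Illustrative paraphrase of the conventional two-frame pruning described
# above, not the actual S-MSCKF implementation.

TRANSLATION_EPS = 0.1            # metres (assumed)
ROTATION_EPS = np.deg2rad(5.0)   # radians (assumed)

def motion_is_small(frame_a, frame_b):
    """True when the relative pose of the two frames stays within the thresholds."""
    T_rel = np.linalg.inv(frame_a["T_w_c"]) @ frame_b["T_w_c"]
    angle = np.arccos(np.clip((np.trace(T_rel[:3, :3]) - 1.0) / 2.0, -1.0, 1.0))
    return np.linalg.norm(T_rel[:3, 3]) < TRANSLATION_EPS and angle < ROTATION_EPS

def prune_two_frames(window):
    """window is ordered oldest-first; window[-1] is the newest frame."""
    removed = []
    for _ in range(2):
        if motion_is_small(window[-2], window[-1]):
            removed.append(window.pop(-2))   # near-duplicate of the newest frame
        else:
            removed.append(window.pop(0))    # otherwise drop the oldest frame
    return removed   # feature tracks on these two frames feed the filter update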
However, although a filtering optimization method such as S-MSCKF improves on the real-time performance and positioning accuracy of EKF-SLAM, its positioning accuracy still falls short of what applications such as AR/VR require, and under complex motion conditions unrecoverable drift may occur.
Disclosure of Invention
An advantage of the present invention is to provide a classification sliding window method and a SLAM positioning method, and a system and an electronic device thereof, which can improve positioning accuracy so as to meet the requirement of applications such as AR/VR on positioning accuracy.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system and an electronic device thereof, wherein in an embodiment of the present invention, the classification sliding window method can optimize the constraint of a plurality of observation frames on feature points, and improve the positioning accuracy.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system and an electronic device thereof, wherein in an embodiment of the present invention, the classification sliding window method can start different sliding window methods for different observation information, thereby further improving positioning accuracy.
Another advantage of the present invention is to provide a classification sliding window method and a SLAM positioning method, and a system and an electronic device thereof, wherein in an embodiment of the present invention, the classification sliding window method can reserve a proper number of observation frames to reduce the amount of back-end computation, which is helpful to improve the overall real-time performance.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system and an electronic device thereof, wherein in an embodiment of the present invention, the SLAM positioning method can accelerate a front-end processing speed, thereby further improving overall real-time performance and meeting a requirement of applications such as AR/VR on real-time performance.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system and an electronic device thereof, wherein in an embodiment of the present invention, the SLAM positioning method can combine an optical flow tracking method and an epipolar search and block matching method to process a front end, so as to reduce errors in left and right eye feature tracking and increase a front end processing speed, thereby further improving overall real-time performance.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, and a system and an electronic device thereof, wherein no complicated structure or large amount of calculation is required to achieve the above advantages. The present invention therefore provides not only a classification sliding window method and a SLAM positioning method, together with a system and an electronic device thereof, but also a practical and reliable solution.
To achieve at least one of the above advantages or other advantages and objects, the present invention provides a classified sliding window method, comprising the steps of:
S110: determining whether the number of observation frames in the window has reached the maximum window number of the window;
S120: when the number of observation frames has reached the maximum window number, removing a preset number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of observation frames is smaller than the maximum window number, further determining whether the number of observation frames is greater than a preset frame number threshold; if so, selectively removing observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.
In an embodiment of the present invention, the method for classifying sliding windows further includes the steps of:
s140: the current observation frame is added to the window as the latest frame in the window.
In an embodiment of the present invention, the step S120 includes the steps of:
calculating the relative pose between the oldest first frame and the oldest second frame in the window to judge whether the relative pose is greater than a first pose threshold value;
when the relative pose is larger than the first pose threshold value, the oldest first frame in the window is removed, and a first preset number of observation frames are removed in batches at intervals from the oldest second frame in the window; and
when the relative pose is not larger than the first pose threshold value, the oldest first frame in the window is reserved, and a second preset number of observation frames are batch-culled at intervals from the oldest second frame in the window.
In an embodiment of the present invention, in the step S120, the observation frames in the window are batch-culled at equal intervals, starting from the oldest second frame in the window.
In an embodiment of the present invention, the step S130 includes the steps of:
detecting the feature point tracking rate of the current observation frame to determine whether the feature point tracking rate of the current observation frame is 100%;
when the feature point tracking rate of the current observation frame is 100%, sequentially calculating, starting from the oldest second frame in the window, the relative pose between each observation frame to be removed and the oldest first frame in the window, so as to determine whether that relative pose is smaller than a second pose threshold; if so, removing the observation frame to be removed; if not, retaining it; and
when the feature point tracking rate of the current observation frame is less than 100%, retaining all observation frames to be removed in the window.
In an embodiment of the present invention, the step S130 further includes the steps of:
the number of the observation frames to be rejected which are rejected from the window is monitored, so that the rejection operation is stopped when the rejection number of the observation frames to be rejected reaches 1/3 of the maximum window number.
According to another aspect of the present invention, the present invention further provides a SLAM positioning method, including the steps of:
performing front-end processing on the original image acquired by a binocular camera to obtain the feature point information of the current observation frame;
performing filter prediction processing on the IMU information acquired by an inertial measurement unit to obtain the predicted pose and predicted speed of the binocular camera;
performing map construction according to the feature point information of the current observation frame to determine whether any tracking-lost feature point information exists, and further obtaining the estimated pose and estimated speed of the binocular camera through filter estimation processing;
based on the feature point information of the current observation frame, performing sliding window processing by a classification sliding window method to determine whether any observation frames have been removed; and
when removed observation frames exist, performing filter estimation processing on the feature point information in the removed observation frames according to the estimated pose and estimated speed of the binocular camera to obtain the optimized pose and optimized speed of the binocular camera; when no removed observation frames exist, directly taking the estimated pose and estimated speed of the binocular camera as the optimized pose and optimized speed of the binocular camera.
In an embodiment of the present invention, the classification sliding window method includes the steps of:
S110: determining whether the number of observation frames in the window has reached the maximum window number of the window;
S120: when the number of observation frames has reached the maximum window number, removing a preset number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of observation frames is smaller than the maximum window number, further determining whether the number of observation frames is greater than a preset frame number threshold; if so, selectively removing observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.
In an embodiment of the present invention, the step of performing front-end processing on the original image acquired by the binocular camera to obtain the feature point information of the current observation frame includes the steps of:
tracking the feature points of the left-eye image in the original image by an optical flow tracking method to obtain the feature point information of the left eye in the current observation frame; and
tracking the feature points of the right-eye image in the original image by an epipolar search and block matching method, according to the relative pose between the left-eye camera and the right-eye camera of the binocular camera, to obtain the feature point information of the right eye in the current observation frame.
In an embodiment of the present invention, the step of performing front-end processing on the original image acquired by the binocular camera to obtain the feature point information of the current observation frame further includes the steps of:
determining whether the number of feature points of the left-eye image tracked by the optical flow tracking method is smaller than a feature point number threshold, and if so, extracting new feature point information from the left-eye image by a feature point extraction method to supplement the feature point information of the left eye in the current observation frame.
In an embodiment of the present invention, the step of performing map construction according to the feature point information of the current observation frame to determine whether there is a feature point with a tracking loss, and further obtaining the estimated pose and the estimated speed of the binocular camera through filter estimation processing includes the steps of:
when the tracking-lost feature point exists, carrying out filter estimation processing on the information of the tracking-lost feature point according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
when the tracking-lost feature point does not exist, the predicted pose and the predicted speed of the binocular camera are directly taken as the estimated pose and the estimated speed of the binocular camera.
In an embodiment of the present invention, the step of performing map construction according to the feature point information of the current observation frame to determine whether there is a feature point with a tracking loss, and further obtaining the estimated pose and the estimated speed of the binocular camera through filter estimation processing includes the steps of:
when the tracking-lost feature point exists, carrying out filter estimation processing on the information of the tracking-lost feature point according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
when the tracking-lost feature points do not exist, a preset number of feature points are screened from the feature points of the current observation frame, and then filtering estimation processing is carried out on the information of the screened feature points according to the predicted pose and the predicted speed of the binocular camera, so as to obtain the estimated pose and the estimated speed of the binocular camera.
According to another aspect of the present invention, there is further provided a classification sliding window system for performing the classification sliding window processing described above, wherein the classification sliding window system includes:
a determining module, configured to determine whether the number of all observation frames in the window reaches the maximum window number of the window;
a first eliminating module, wherein the first eliminating module is communicably connected to the determining module and is configured to, when the number of the observation frames reaches the maximum window number, eliminate a predetermined number of observation frames in batches from the window at intervals according to a relative pose between an oldest first frame and an oldest second frame in the window; and
a second eliminating module, wherein the second eliminating module is communicably connected to the determining module, and is configured to further determine whether the number of the observation frames is greater than a preset frame number threshold when the number of the observation frames is less than the maximum window number, and if so, selectively eliminate the observation frames from the window according to the feature point tracking rate of the current observation frame; if not, all observation frames in the window are reserved.
In an embodiment of the present invention, the classification sliding window system further includes:
an adding module, wherein the adding module is communicably connected to the first eliminating module and the second eliminating module respectively, and is configured to add the current observation frame to the window as the latest frame in the window.
In an embodiment of the present invention, the first eliminating module includes a pose calculation module and a batch eliminating module that are communicably connected to each other, wherein the pose calculation module is configured to calculate the relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold; the batch eliminating module is configured to eliminate the oldest first frame in the window when the relative pose is greater than the first pose threshold, and to eliminate a first predetermined number of observation frames in batches at intervals, starting from the oldest second frame in the window; and the batch eliminating module is further configured, when the relative pose is not greater than the first pose threshold, to retain the oldest first frame in the window and to eliminate a second predetermined number of observation frames in batches at intervals, starting from the oldest second frame in the window.
In an embodiment of the present invention, the second eliminating module includes a detecting module, a selective eliminating module, and a retaining module, wherein the detecting module is configured to detect the feature point tracking rate of the current observation frame to determine whether the feature point tracking rate of the current observation frame is 100%; the selective eliminating module is communicably connected to the detecting module and is configured, when the feature point tracking rate of the current observation frame is 100%, to sequentially calculate, starting from the oldest second frame in the window, the relative pose between each observation frame to be eliminated and the oldest first frame in the window, so as to determine whether that relative pose is smaller than a second pose threshold, and if so, to eliminate the observation frame to be eliminated; if not, to retain it; and the retaining module is communicably connected to the detecting module and is configured to retain all observation frames to be eliminated in the window when the feature point tracking rate of the current observation frame is less than 100%.
In an embodiment of the present invention, the second eliminating module further includes a monitoring module configured to monitor the number of observation frames eliminated from the window, so as to stop elimination when the number of eliminated observation frames reaches 1/3 of the maximum window number.
According to another aspect of the present invention, there is further provided a SLAM positioning system for performing positioning based on the original image acquired by a binocular camera and the IMU information acquired by an inertial measurement unit, wherein the SLAM positioning system includes:
the front-end system is used for carrying out front-end processing on the original image to obtain the characteristic point information of the current observation frame;
the filter prediction system is used for carrying out filter prediction processing on the IMU information so as to obtain the predicted pose and the predicted speed of the binocular camera;
the map construction system, which comprises a map construction module and a feature point determination module communicably connected to each other, wherein the map construction module is communicably connected to the front-end system and the filter prediction system respectively and is used for performing map construction according to the feature point information of the current observation frame, and the feature point determination module is used for determining whether any tracking-lost feature point information exists, whereby the estimated pose and estimated speed of the binocular camera are further obtained through filter estimation processing;
a classification sliding window system, which is used for performing sliding window processing by a classification sliding window method based on the feature point information of the current observation frame, so as to determine whether any observation frames have been removed; and
the filter estimation system, which is communicably connected to the classification sliding window system and is configured, when removed observation frames exist, to perform filter estimation processing on the feature point information in the removed observation frames according to the estimated pose and estimated speed of the binocular camera, so as to obtain the optimized pose and optimized speed of the binocular camera; and, when no removed observation frames exist, to take the estimated pose and estimated speed of the binocular camera directly as the optimized pose and optimized speed of the binocular camera.
In an embodiment of the present invention, the front-end system includes an optical flow tracking module, an epipolar search and block matching module, and a judging and extracting module that are communicably connected to each other, wherein the optical flow tracking module is used for tracking feature points of the left-eye image in the original image by an optical flow tracking method to obtain the feature point information of the left eye in the current observation frame; the epipolar search and block matching module is used for tracking feature points of the right-eye image in the original image by an epipolar search and block matching method, according to the relative pose between the left-eye camera and the right-eye camera of the binocular camera, so as to obtain the feature point information of the right eye in the current observation frame; and the judging and extracting module is used for determining whether the number of feature points of the left-eye image tracked by the optical flow tracking method is smaller than a feature point number threshold, and if so, extracting new feature point information from the left-eye image by a feature point extraction method to supplement the feature point information of the left eye in the current observation frame.
In an embodiment of the present invention, the filter estimation system is further configured to, when there is the tracking-lost feature point, perform filter estimation processing on information of the tracking-lost feature point according to the predicted pose and the predicted speed of the binocular camera to obtain the estimated pose and the estimated speed of the binocular camera; and when the tracking-lost feature point does not exist, directly taking the predicted pose and the predicted speed of the binocular camera as the estimated pose and the estimated speed of the binocular camera.
In an embodiment of the present invention, the map construction system further includes a feature point screening module, configured to screen a predetermined number of feature points from the feature points of the current observation frame when there is no feature point with the tracking loss; and the filter estimation system is also used for carrying out filtering estimation processing on the information of the screened feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera.
According to another aspect of the present invention, the present invention also provides an electronic device comprising:
at least one processor configured to execute instructions; and
a memory communicatively coupled to the at least one processor, wherein the memory stores at least one instruction executable by the at least one processor to cause the at least one processor to perform some or all of the steps of a SLAM positioning method, wherein the SLAM positioning method comprises the steps of:
performing front-end processing on the original image acquired by a binocular camera to obtain the feature point information of the current observation frame;
performing filter prediction processing on the IMU information acquired by an inertial measurement unit to obtain the predicted pose and predicted speed of the binocular camera;
performing map construction according to the feature point information of the current observation frame to determine whether any tracking-lost feature point information exists, and further obtaining the estimated pose and estimated speed of the binocular camera through filter estimation processing;
based on the feature point information of the current observation frame, performing sliding window processing by a classification sliding window method to determine whether any observation frames have been removed; and
when removed observation frames exist, performing filter estimation processing on the feature point information in the removed observation frames according to the estimated pose and estimated speed of the binocular camera to obtain the optimized pose and optimized speed of the binocular camera; when no removed observation frames exist, directly taking the estimated pose and estimated speed of the binocular camera as the optimized pose and optimized speed of the binocular camera.
Further objects and advantages of the invention will be fully apparent from the ensuing description and drawings.
These and other objects, features and advantages of the present invention will become more fully apparent from the following detailed description, the accompanying drawings and the claims.
Drawings
Fig. 1 is a flowchart illustrating a classification sliding window method according to an embodiment of the present invention.
Fig. 2 shows a flow chart of one of the steps of the classification sliding window method according to the above-described embodiment of the present invention.
Fig. 3 is a flow chart illustrating a second step of the classification sliding window method according to the above embodiment of the present invention.
Fig. 4 shows an example of the classification sliding window method according to the above-described embodiment of the present invention.
Fig. 5 shows a flowchart of a SLAM positioning method according to an embodiment of the invention.
Fig. 6 is a flow chart illustrating one of the steps of the SLAM positioning method according to the above embodiment of the present invention.
Fig. 7 is a flow chart illustrating a second step of the SLAM positioning method according to the above embodiment of the present invention.
Fig. 8 shows an example of the SLAM positioning method according to the above-described embodiment of the present invention.
Fig. 9 shows an example of the front-end processing steps of the SLAM positioning method according to the above-described embodiment of the present invention.
FIG. 10 shows a block diagram schematic of a classification sliding window system according to an embodiment of the invention.
Fig. 11 shows a block diagram schematic of a SLAM positioning system according to an embodiment of the invention.
FIG. 12 shows a block diagram schematic of an electronic device according to an embodiment of the invention.
Detailed Description
The following description is presented to disclose the invention so as to enable any person skilled in the art to practice the invention. The preferred embodiments in the following description are given by way of example only, and other obvious variations will occur to those skilled in the art. The basic principles of the invention, as defined in the following description, may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.
In the present invention, the terms "a" and "an" in the claims and the description should be understood as meaning "one or more"; that is, an element may be singular in one embodiment and plural in another. Unless the present disclosure explicitly recites that the number of an element is one, the terms "a" and "an" are not to be construed as limiting that element to a single instance.
In the description of the present invention, it is to be understood that terms such as "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. It should also be noted that, unless explicitly stated or limited otherwise, the terms "connected" and "coupled" are to be interpreted broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; and direct, or indirect through an intermediary. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Currently, an existing filtering-based SLAM positioning method such as S-MSCKF generally performs fusion positioning with a visual sensor (such as a binocular camera) and an inertial measurement unit (IMU), wherein each observation frame input to the back end is saved in a window; when the number of observation frames in the window reaches the maximum storable number (i.e., the maximum window number), several observation frames in the window must be deleted to complete the sliding window operation. The sliding window strategy adopted by the existing S-MSCKF positioning method is as follows: when the window is full (i.e., the number of observation frames in the window equals the maximum window number), the relative pose between the newest frame and the second-newest frame is calculated first; if this relative pose meets a certain threshold, the second-newest frame is removed; if not, the oldest first frame is removed. The same test is then applied once more to the updated newest and second-newest frames; if the threshold is met, the second-newest frame is removed; if not, the oldest second frame is removed. Finally, the feature points on the two removed observation frames are input to a filter for filtering optimization, thereby obtaining the positioning information.
In other words, the sliding window strategy adopted by the existing S-MSCKF positioning method does not take the influence of the current observation frame or of the oldest two frames into account, and its elimination strategy is too aggressive. Although its real-time performance and positioning accuracy are improved over the EKF-SLAM positioning method, the positioning accuracy still does not meet the requirements of applications such as AR/VR. In addition, under complex motion conditions, an existing filtering-based SLAM positioning method such as S-MSCKF may suffer unrecoverable drift, resulting in irreversible errors in the positioning result. To solve the above problems, the present invention provides a classification sliding window method, a SLAM positioning method, a system thereof, and an electronic device.
Illustrative method
Referring to fig. 1-4 of the drawings, a classification sliding window method according to an embodiment of the present invention is illustrated. Specifically, as shown in fig. 1, the classification sliding window method includes the steps of:
S110: determining whether the number of observation frames in a window has reached the maximum window number of the window;
S120: when the number of observation frames has reached the maximum window number, removing a preset number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of observation frames is smaller than the maximum window number, further determining whether the number of observation frames is greater than a preset frame number threshold; if so, selectively removing observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.
It is noted that the classification sliding window method of the present invention selects different sliding window strategies (i.e., different ways of deleting observation frames saved in the window) for different observation information (such as the number of observation frames in the window), so that useless information is deleted while as little useful information as possible is lost.
Further, in the classification sliding window method according to the above embodiment of the present invention, the step S110 may be performed when the current observation frame (i.e., the observation frame newly input to the back end) is received; that is, when the back end receives the current observation frame, the number of observation frames already stored in the window is checked to determine whether it has reached the maximum window number of the window. It is to be understood that, for ease of understanding and to avoid confusion, the present invention defines the first observation frame entering the window (i.e., the first observation frame saved in the window) as the oldest first frame, defines the second observation frame entering the window (i.e., the second observation frame saved in the window) as the oldest second frame, and, by analogy, orders and names all observation frames in the window according to the time at which they entered the window.
It is worth mentioning that, after the step S120 or the step S130 of the classification sliding window method of the present invention has been completed, the number of observation frames stored in the window is necessarily smaller than the maximum window number of the window, so that the window can continue to store a new observation frame; therefore, as shown in fig. 1, the classification sliding window method of the present invention further includes the steps of:
S140: adding the current observation frame to the window as the latest frame in the window.
In other words, the classification sliding window method of the present invention adds the current observation frame to the window after removing part of the observation frames in the window in a classified manner, thereby achieving the overall sliding window effect: the observation frames stored in the window stay up to date so that the constraint relationships are refreshed, while the original constraint relationships with strong relevance are retained as fully as possible.
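Before detailing each branch, the overall dispatch of steps S110 to S140 can be summarized in a short Python sketch. This is a minimal illustration, not the patented implementation: the constants are example values (the thirty-frame window used in the examples below, and a frame number threshold assumed to lie in the 1/3 to 2/3 range mentioned later), and the two helper functions are sketched after the corresponding steps are described.

MAX_WINDOW = 30        # maximum window number (example value from the text)
FRAME_THRESHOLD = 15   # preset frame number threshold (assumed, between
                       # 1/3 and 2/3 of MAX_WINDOW)

def classified_sliding_window(window, current_frame):
    # S110: has the window reached its maximum window number?
    if len(window) >= MAX_WINDOW:
        # S120: batch removal at intervals based on the oldest two frames
        # (sketched below)
        batch_cull_at_intervals(window)
    elif len(window) > FRAME_THRESHOLD:
        # S130: selective removal based on the current frame's feature point
        # tracking rate (sketched below)
        selectively_cull(window, current_frame)
    # otherwise all observation frames in the window are retained
    window.append(current_frame)   # S140: the current frame becomes the newest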
It should be noted that, according to the above embodiment of the present invention, when the number of all observation frames reaches the maximum number of windows of the window, a part of observation frames must be removed from the window before the current observation frame can be added into the window. When the relative pose between the oldest first frame and the oldest second frame in the window is larger than a certain threshold, the effect of the oldest first frame on the constraint relation is smaller than the effect of the oldest second frame on the constraint relation, so that the oldest first frame can be removed to reduce the adverse effect caused by removing the observation frame; when the relative pose between the oldest first frame and the oldest second frame in the window is not larger than the threshold, the effect of the oldest first frame on the constraint relationship is larger than the effect of the oldest second frame on the constraint relationship, so that the oldest second frame can be removed, and the adverse effect caused by removing the observation frame can be reduced. Therefore, as shown in fig. 2, the step S120 of the classification sliding window method may include the steps of:
S121: calculating the relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold;
S122: when the relative pose is greater than the first pose threshold, removing the oldest first frame and removing a first predetermined number of observation frames in batches at intervals, starting from the oldest second frame in the window; and
S123: when the relative pose is not greater than the first pose threshold, retaining the oldest first frame and removing a second predetermined number of observation frames in batches at intervals, starting from the oldest second frame in the window.
It is noted that, since the relative pose between the oldest first frame and the oldest second frame in the window may comprise an angle difference and a distance difference between the two frames, the first pose threshold of the present invention may be implemented as a preset relative pose (i.e., a preset angle difference and a preset distance difference). In other words, in the classification sliding window method of the present invention, if the relative pose calculated in the step S121 is greater than the preset relative pose, that is, when the angle difference and the distance difference calculated in the step S121 are greater than the preset angle difference and the preset distance difference respectively, the classification sliding window method performs the step S122; otherwise, the step S123 is performed. It can be understood that the first pose threshold of the present invention can be obtained by tuning the SLAM positioning method, and the range or specific value of the first pose threshold is determined by the accuracy of the positioning information obtained by the SLAM positioning method, which is not described in detail herein.
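The angle-plus-distance test can be written out directly. In the following sketch the frames are again assumed to carry a 4x4 camera-to-world pose matrix under the hypothetical key "T_w_c", and both threshold values are placeholders standing in for the tuned first pose threshold:

import numpy as np

ANGLE_THRESHOLD = np.deg2rad(10.0)   # preset angle difference (assumed value)
DISTANCE_THRESHOLD = 0.5             # preset distance difference in metres (assumed value)

def relative_pose(frame_a, frame_b):
    """Rotation angle (rad) and translation distance between two frame poses."""
    T_rel = np.linalg.inv(frame_a["T_w_c"]) @ frame_b["T_w_c"]
    cos_angle = (np.trace(T_rel[:3, :3]) - 1.0) / 2.0
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    distance = np.linalg.norm(T_rel[:3, 3])
    return angle, distance

def exceeds_first_pose_threshold(frame_a, frame_b):
    # Per the text, both the angle difference and the distance difference must
    # exceed their preset values for step S122 to be taken.
    angle, distance = relative_pose(frame_a, frame_b)
    return angle > ANGLE_THRESHOLD and distance > DISTANCE_THRESHOLD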
Preferably, in the step S122 of the classification sliding window method of the present invention, when the relative pose between the oldest first frame and the oldest second frame in the window is greater than the first pose threshold, that is, the angle difference is greater than the preset angle difference, and the distance difference is greater than the preset distance difference, the oldest first frame is removed, and then a first predetermined number of the observation frames are batch removed at equal intervals from the oldest second frame in the window.
It is understood that the first predetermined number may be set, but is not limited to being set, according to the maximum window number of the window. For example, when the maximum window number is thirty frames, the first predetermined number may be implemented as ten frames, such that every other frame is removed starting from the oldest second frame. The oldest first frame is thus excluded from the window together with the oldest second frame, the oldest fourth frame, the oldest sixth frame, and so on. Of course, in other examples of the present invention, the first predetermined number may be implemented as nine frames, eleven frames, or another number.
Similarly, in the step S123 of the classified sliding window method of the present invention, when the relative pose between the oldest first frame and the oldest second frame in the window is not greater than the first pose threshold, that is, the angle difference is not greater than the preset angle difference and/or the distance difference is not greater than the preset distance difference, the oldest first frame is retained, and the observation frames are removed from the oldest second frame in the window in batches at equal intervals, where the number of the observation frames removed from the window is equal to the second predetermined number.
It is understood that the second predetermined number may likewise be set, but is not limited to being set, according to the maximum window number of the window. For example, when the maximum window number of the window is thirty frames, the second predetermined number may be implemented as ten frames, such that every other frame is removed starting from the oldest second frame; that is, the ten frames consisting of the oldest second frame, the oldest fourth frame, the oldest sixth frame, and so on, are excluded from the window. Of course, in other examples of the present invention, the second predetermined number may be implemented as nine frames, eleven frames, or another number.
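Continuing the sketch, step S120 then reduces to selecting every other index starting from the oldest second frame, plus the oldest first frame when the first pose threshold is exceeded. The counts below follow the thirty-frame example, and exceeds_first_pose_threshold is the helper sketched above:

FIRST_PREDETERMINED = 10    # frames removed in batches when the oldest first frame also goes
SECOND_PREDETERMINED = 10   # frames removed in batches when the oldest first frame is kept

def batch_cull_at_intervals(window):
    """Step S120: batch removal at equal intervals (illustrative sketch)."""
    if exceeds_first_pose_threshold(window[0], window[1]):
        drop_oldest_first = True
        count = FIRST_PREDETERMINED
    else:
        drop_oldest_first = False
        count = SECOND_PREDETERMINED
    # 0-based indices 1, 3, 5, ... are the oldest second, fourth, sixth ... frames.
    victims = set(range(1, 1 + 2 * count, 2))
    if drop_oldest_first:
        victims.add(0)
    window[:] = [f for i, f in enumerate(window) if i not in victims]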
It is worth mentioning that, according to the above embodiment of the present invention, when the number of observation frames has not reached the maximum window number of the window, the current observation frame could in principle be added to the window directly without removing any observation frame. Nevertheless, in the step S130 of the classification sliding window method of the present invention, the number of observation frames in the window is first compared with the preset frame number threshold to determine whether the number of observation frames is greater than the preset frame number threshold, and whether to remove observation frames from the window is then decided according to the comparison result, so as to ensure that a sufficient number of observation frames remain stored in the window. It is understood that the preset frame number threshold of the present invention can be set, but is not limited to being set, according to the maximum window number of the window, so that the number of observation frames in the window is kept as sufficient as possible. For example, the preset frame number threshold may be between 1/3 and 2/3 of the maximum window number (i.e., when the maximum window number is thirty, the preset frame number threshold may be between ten and twenty).
Furthermore, when the number of the observation frames in the window is greater than the preset frame number threshold, the classification sliding window method of the present invention may further determine whether to eliminate the observation frames according to whether the feature point tracking rate of the current observation frame reaches 100%, so as to take the characteristics of the current observation frame into consideration.
Illustratively, as shown in fig. 3, the step S130 of the classification sliding window method of the present invention may include the steps of:
S131: detecting the feature point tracking rate of the current observation frame to determine whether the feature point tracking rate of the current observation frame is 100%;
S132: when the feature point tracking rate of the current observation frame is 100%, sequentially calculating, starting from the oldest second frame in the window, the relative pose between each observation frame to be removed and the oldest first frame in the window, so as to determine whether that relative pose is smaller than a second pose threshold; if so, removing the observation frame to be removed; if not, retaining it; and
S133: when the feature point tracking rate of the current observation frame is less than 100%, retaining all observation frames to be removed in the window.
It is noted that in this example of the present invention, other observation frames in the window besides the oldest first frame may be defined as the observation frames to be culled in the window. In this way, the fact that the relative pose between the observation frame to be rejected and the oldest first frame in the window is smaller than the second pose threshold means that the pose between the observation frame to be rejected and the oldest first frame does not change much, so that the original constraint relationship can be kept as much as possible even if the observation frame to be rejected is rejected, as long as the oldest first frame is kept. It can be understood that the second pose threshold value of the present invention can also be obtained by debugging the SLAM positioning method, and the range or the specific numerical value of the second pose threshold value is determined by the accuracy of the positioning information obtained by the SLAM positioning method, which is not described in detail herein.
Preferably, in the step S132, the number of the observation frames to be removed from the window does not exceed 1/3 of the maximum window number, so as to ensure that a sufficient number of observation frames still remain in the window, which helps to ensure that the positioning accuracy is always kept at a high level. For example, when the maximum window number of the window is thirty, in the step S132, the number of the observation frames to be removed from the window is at most ten frames.
In other words, as shown in fig. 3, the step S130 of the classification sliding window method of the present invention may further include the steps of:
S134: monitoring the number of observation frames removed from the window, so as to stop the removal operation when the number of removed observation frames reaches 1/3 of the maximum window number.
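Steps S131 to S134 can be sketched as follows, reusing relative_pose and MAX_WINDOW from the earlier examples. The "tracking_rate" field and the second pose threshold values are illustrative assumptions:

import numpy as np

SECOND_POSE_ANGLE = np.deg2rad(5.0)   # second pose threshold, angle part (assumed)
SECOND_POSE_DISTANCE = 0.2            # second pose threshold, distance part (assumed)

def selectively_cull(window, current_frame):
    """Step S130: selective removal driven by the current frame's tracking rate."""
    if current_frame["tracking_rate"] < 1.0:
        return                      # S133: keep every frame to be removed
    oldest = window[0]              # the oldest first frame is retained here
    survivors, culled = [oldest], 0
    for frame in window[1:]:        # S132: candidates, from the oldest second frame on
        angle, distance = relative_pose(frame, oldest)
        near_static = angle < SECOND_POSE_ANGLE and distance < SECOND_POSE_DISTANCE
        if near_static and culled < MAX_WINDOW // 3:   # S134: cap at 1/3 of the window
            culled += 1             # pose barely changed; constraints survive without it
        else:
            survivors.append(frame)
    window[:] = survivors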
Illustratively, as shown in fig. 4, according to the classification sliding window method of the above-described embodiment of the present invention, when a new observation frame (i.e., a current observation frame) is input to the back end, first, the number of observation frames saved in the window is detected to determine whether the number of observation frames reaches the maximum window number of the window.
Secondly, if so, calculating the relative pose between the oldest first frame and the oldest second frame in the window to judge whether the relative pose is greater than a first pose threshold value; if not, further determining whether the number of the observation frames is larger than a preset frame number threshold.
Then, when the relative pose is larger than the first pose threshold value, the oldest first frame is removed, and a first preset number of observation frames are removed in batches at intervals from the oldest second frame in the window; and when the relative pose is not larger than the first pose threshold value, reserving the oldest first frame, and batch-removing a second preset number of observation frames at intervals from the oldest second frame in the window. Correspondingly, when the number of the observation frames is greater than the preset frame number threshold, selectively removing the observation frames from the window according to the characteristic point tracking rate of the current observation frames; and when the number of the observation frames is not more than the preset frame number threshold, reserving all the observation frames in the window.
Then, the feature point tracking rate of the current observation frame is detected to determine whether the feature point tracking rate of the current observation frame is 100%. If so, starting from the oldest second frame in the window, the relative pose between each observation frame to be removed and the oldest first frame in the window is calculated in turn to determine whether that relative pose is smaller than a second pose threshold; when the relative pose is smaller than the second pose threshold, the observation frame to be removed is removed, wherein at most 1/3 of the maximum window number of observation frames are removed from the window; when the relative pose is not smaller than the second pose threshold, the observation frame to be removed is retained. If not, all observation frames to be removed in the window are retained.
Finally, the current observation frame is added to the window as the latest frame in the window (i.e., the oldest first-to-last frame or the latest first frame in the window).
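For concreteness, the decision flow just described can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the frame layout (a dict holding a translation `t`, a rotation `angle`, and a `track_rate`), the helper `pose_gap`, and all numeric defaults are assumptions made for the sketch.

```python
import numpy as np

def pose_gap(fa, fb):
    """Magnitude of the relative pose between two frames: translation distance
    plus rotation-angle difference (the frame layout here is an assumption)."""
    return np.linalg.norm(fa["t"] - fb["t"]) + abs(fa["angle"] - fb["angle"])

def classified_sliding_window(window, current, max_window=30, frame_threshold=15,
                              thresh1=0.5, thresh2=0.05, batch_a=10, batch_b=8):
    """Sketch of the classification sliding window flow; `window` is a list of
    frame dicts, oldest first. All thresholds and counts are illustrative."""
    if len(window) >= max_window:
        # Window full: compare the oldest first and oldest second frames.
        if pose_gap(window[0], window[1]) > thresh1:
            window.pop(0)                         # large motion: cull oldest frame
            n_cull = batch_a                      # first predetermined number
        else:
            n_cull = batch_b                      # second predetermined number
        # Cull n_cull frames at intervals, starting from the oldest second frame.
        removed, kept = 0, [window[0]]
        for i, f in enumerate(window[1:], start=1):
            if i % 2 == 1 and removed < n_cull:   # every other frame
                removed += 1
            else:
                kept.append(f)
        window[:] = kept
    elif len(window) > frame_threshold:
        if current["track_rate"] == 1.0:          # no lost feature points
            culled, kept = 0, [window[0]]
            for f in window[1:]:                  # scan from the oldest second frame
                if culled < max_window // 3 and pose_gap(f, window[0]) < thresh2:
                    culled += 1                   # nearly static w.r.t. oldest: cull
                else:
                    kept.append(f)
            window[:] = kept
        # tracking rate below 100%: keep every frame in the window
    window.append(current)                        # current becomes the latest frame
    return window
```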
It should be noted that the classification sliding window method of the present invention can effectively cull redundant observation frames from the window according to different motion conditions and reduce the number of feature-point constraints, so that the amount of calculation is greatly reduced when the same feature points are processed subsequently. Therefore, compared with positioning methods based on nonlinear optimization, the SLAM positioning method can greatly improve calculation speed and reduce memory consumption. In addition, in the classification sliding window method, the feature points contained in the culled observation frames can still provide observation information for the filtering optimization. Therefore, compared with the S-MSCKF positioning method, the classification sliding window method of the present invention provides more observation information and constraints for the filtering optimization, and the positioning accuracy is improved.
According to another aspect of the present invention, as shown in fig. 5, the present invention further provides a SLAM positioning method, comprising the steps of:
s210: performing front-end processing on an original image acquired by a binocular camera to obtain feature point information of a current observation frame;
s220: carrying out filter prediction processing on IMU information acquired by an inertial measurement unit to obtain the predicted pose and the predicted speed of the binocular camera;
s230: performing map construction according to the feature point information of the current observation frame to determine whether tracking loss feature points exist or not, and further performing filter estimation processing to obtain an estimated pose and an estimated speed of the binocular camera;
s240: based on the feature point information of the current observation frame, performing sliding window processing by the classification sliding window method to determine whether the removed observation frame exists; and
s250: when the removed observation frame exists, carrying out filter estimation processing on the feature point information in the removed observation frame according to the estimated pose and the estimated speed of the binocular camera so as to obtain the optimized pose and the optimized speed of the binocular camera; and when the rejected observation frames do not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
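A minimal structural sketch of how steps S210 through S250 chain together is given below. The `state` object and the injected callables (`front_end`, `update_map`, `slide_window`) are placeholders for the subsystems described above, not the patent's actual interfaces.

```python
def slam_step(raw_image, imu_batch, state, front_end, update_map, slide_window):
    """One pass through S210-S250, with the subsystems injected as callables."""
    frame = front_end(raw_image)                 # S210: feature point information
    state.predict(imu_batch)                     # S220: filter prediction (pose, speed)
    lost = update_map(state, frame)              # S230: map construction, lost features
    if lost:
        state.ekf_update(lost)                   # estimated pose/speed from lost points
    culled = slide_window(state.window, frame)   # S240: classification sliding window
    if culled:
        state.ekf_update(culled)                 # S250: optimize with culled frames
    return state.pose, state.velocity            # otherwise the estimate passes through
```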
It is worth noting that the SLAM positioning method can effectively remove redundant observation frames in the window according to different motion conditions through the classification sliding window method, so that the positioning accuracy and the real-time performance of the SLAM positioning method are obviously improved. It can be understood that, in the SLAM positioning method of the present invention, reference may be made to the classification sliding window method in the foregoing embodiment of the present invention in the process of performing sliding window processing by using the classification sliding window method, which is not described again herein.
In addition, in the process of performing front-end processing on the original image acquired by the binocular camera (including its left eye image and right eye image) to obtain the feature point information of the current observation frame (including the left eye and right eye feature point information), feature point tracking generally needs to be performed on the original image. In the existing S-MSCKF positioning method, the left eye image is usually tracked by an optical flow tracking method to obtain the left eye feature point information of the current observation frame, and the right eye image is then tracked by a stereo matching method to obtain the right eye feature point information, thereby obtaining the feature point information of the current observation frame. However, this involves a large amount of calculation, a slow front-end processing speed and a long processing time, and in particular the error of left-right feature point tracking is large, so that the positioning accuracy and real-time performance of the existing S-MSCKF positioning method can hardly meet the requirements of applications such as AR/VR.
Therefore, in order to reduce the error of left and right eye feature point tracking and increase the front-end processing speed, as shown in fig. 6, step S210 of the SLAM positioning method of the present invention may include the steps of:
s211: tracking the feature points of the left target image in the original image by an optical flow tracking method to obtain the feature point information of the left target in the current observation frame; and
s212: and tracking the feature points of the right eye image in the original image by an epipolar search and block matching method according to the relative pose between the left eye camera and the right eye camera in the binocular camera to obtain the feature point information of the right eye image in the current observation frame.
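A simplified sketch of step s212 follows, assuming rectified stereo so that the relative pose between the left and right cameras is folded into the rectification and the epipolar line of each left-eye feature is the corresponding image row of a grayscale right image; patch size and search range are illustrative.

```python
import numpy as np

def track_right_eye(left_img, right_img, pts_left, patch=7, max_disp=64):
    """Block matching along the (rectified) epipolar line for each left feature."""
    half = patch // 2
    h, w = left_img.shape
    matches = {}
    for (u, v) in pts_left.astype(int):
        if not (half <= v < h - half and half <= u < w - half):
            continue
        ref = left_img[v-half:v+half+1, u-half:u+half+1].astype(np.float32)
        best_cost, best_u = np.inf, None
        for d in range(max_disp):                 # walk along the epipolar line
            u2 = u - d
            if u2 < half:
                break
            cand = right_img[v-half:v+half+1, u2-half:u2+half+1].astype(np.float32)
            cost = np.abs(ref - cand).sum()       # SAD block-matching cost
            if cost < best_cost:
                best_cost, best_u = cost, u2
        if best_u is not None:
            matches[(u, v)] = (best_u, v)         # matched right-eye feature point
    return matches
```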
It should be noted that, in the present invention, for a newly received left eye image (i.e. the left eye image in the current original image), the number of feature points tracked by the optical flow tracking method may decrease, i.e. some feature points may be lost. At this time, new feature points need to be supplemented so that the number of feature points reaches the maximum number. Therefore, according to the above embodiment of the present invention, as shown in fig. 6, the step S210 of the SLAM positioning method of the present invention may further include the steps of:
s213: judging whether the number of the feature points of the left eye image tracked by the optical flow tracking method is smaller than a feature point number threshold value, if so, extracting new feature point information from the left eye image by a feature point extraction method so as to supplement the feature point information of the left eye in the current observation frame.
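Step s213 might look like the following sketch, here using OpenCV's `goodFeaturesToTrack` as a stand-in for the patent's unspecified feature point extraction method; the mask keeps newly extracted corners away from surviving tracks, and all parameter values are illustrative.

```python
import cv2
import numpy as np

def replenish_features(gray, tracked_pts, max_feats=200, min_dist=20):
    """Top up the left-eye feature set when tracking kept too few points."""
    if len(tracked_pts) >= max_feats:
        return tracked_pts
    mask = np.full(gray.shape, 255, dtype=np.uint8)
    for (u, v) in tracked_pts.astype(int):
        cv2.circle(mask, (int(u), int(v)), min_dist, 0, -1)  # keep new corners apart
    new = cv2.goodFeaturesToTrack(gray, max_feats - len(tracked_pts),
                                  qualityLevel=0.01, minDistance=min_dist, mask=mask)
    if new is not None:
        tracked_pts = np.vstack([tracked_pts.reshape(-1, 2), new.reshape(-1, 2)])
    return tracked_pts
```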
Further, in an example of the present invention, as shown in fig. 7, the step S230 of the SLAM positioning method may include the steps of:
s231: when the tracking-lost feature points exist, carrying out filter estimation processing on the information of the tracking-lost feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
s232: and when the tracking-lost feature points do not exist, directly taking the predicted pose and the predicted speed of the binocular camera as the estimated pose and the estimated speed of the binocular camera.
It is to be noted that, in the above example of the present invention, filter estimation processing is performed on the lost feature point information only when such information exists, so as to obtain a more accurate estimated pose and estimated speed; when there is no lost feature point information, no filter estimation processing is performed, and the accuracy of the estimated pose and estimated speed of the binocular camera is therefore poorer. To solve this problem, in another example of the present invention, as shown in fig. 7, the step S230 of the SLAM positioning method may further include the steps of:
s232': when the tracking-lost feature points do not exist, a preset number of feature points are screened from the feature points of the current observation frame, and then filter estimation processing is carried out on the information of the screened feature points according to the predicted pose and the predicted speed of the binocular camera, so that the estimated pose and the estimated speed of the binocular camera are obtained.
It can be understood that, in this example of the present invention, even when there is no lost feature point information, that is, when the feature point tracking rate of the current observation frame is 100%, a predetermined number of feature points are still selected for the filter estimation processing, so that the accuracy of the estimated pose and the estimated speed of the binocular camera is improved. Further, the predetermined number may be designed according to the maximum tracking number of feature points; for example, it may be implemented as, but is not limited to, 1/10 of the maximum tracking number.
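As a toy illustration of this selection budget (with the 1/10 ratio taken from the text above), the following sketch picks the features handed to the filter update; the assumption that each feature record carries a precomputed `match_error` anticipates the screening criterion described next.

```python
def select_update_features(frame_feats, lost_feats, max_track=150):
    """Choose the features fed to the filter estimation processing."""
    if lost_feats:
        return lost_feats                        # lost features drive the update
    k = max(1, max_track // 10)                  # the predetermined number
    return sorted(frame_feats, key=lambda f: f["match_error"])[:k]
```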
More specifically, in order to further improve the accuracy of the estimated pose and the estimated speed of the binocular camera, and thus the positioning accuracy of the SLAM positioning method, in the step S232' it is preferable to screen out, by a feature point screening criterion, those feature points whose left-right matching error is smaller than a predetermined threshold from the feature point information of the current observation frame.
Illustratively, the feature point screening criterion of the present invention may be implemented as, but is not limited to, the epipolar-constraint error

e = p2^T · [T]x · R · p1

wherein p1 and p2 are respectively the coordinates of the matched left and right feature points; [T]x is the skew-symmetric matrix of the translation amount T; and R is the rotation amount.

It is to be noted that, theoretically, if p1 and p2 are precisely matched, then e = 0. Due to noise, tracking error and the like, e is in practice not equal to 0; however, the closer e is to 0, the higher the left-right matching degree between p1 and p2.

Preferably, the predetermined threshold of the present invention may be, but is not limited to being, determined by a coefficient s and the internal parameters c_x and c_y of the binocular camera.
Thus, in the step S232', the feature points with the best left-right matching degree, that is, the feature points with the best tracking effect, can be screened from the feature point information of the current observation frame through the feature point screening criterion, so as to maximize the accuracy of the estimated pose and the estimated speed of the binocular camera, and thereby the positioning accuracy of the SLAM positioning method.
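A hedged sketch of this screening computation follows: it evaluates the epipolar-constraint residual as reconstructed above for normalized homogeneous coordinates, with the threshold passed in directly since the patent defines it only in terms of the coefficient s and the intrinsics c_x and c_y.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def lr_match_error(p1, p2, R, t):
    """Epipolar residual e = p2^T [t]x R p1; |e| near 0 means a good match."""
    return float(p2 @ skew(t) @ R @ p1)

def screen_features(pairs, R, t, threshold):
    """Keep left/right pairs whose matching error is below the threshold."""
    return [(p1, p2) for (p1, p2) in pairs
            if abs(lr_match_error(p1, p2, R, t)) < threshold]
```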
Illustratively, in an example of the present invention, as shown in fig. 8, the SLAM locating method of the present invention may include the following steps:
step 1: system initialization and feature extraction
The whole system is initialized to obtain the camera's internal and external parameters and the IMU initial parameters required by the system. Information from the visual sensor is received and the original image is filtered. A two-layer pyramid is built for the image, and feature points are extracted from the top layer to the bottom layer of the pyramid, which accelerates feature extraction when the maximum number of feature points is fixed. The feature points are sorted by their Harris response values, those with high response values are selected, and the features are output.
Step 2: feature tracking and matching
As shown in fig. 9, first, features of the left eye image are extracted; then, using the relative pose of the left and right cameras, the right eye image is tracked by the epipolar search and block matching method, and the tracking result is input to the back end. Next, optical flow tracking is performed on the new left eye image to obtain the feature points of the new frame. If the number of feature points is too small, sufficient features are extracted by the feature extraction method as a supplement so that the maximum tracking number of feature points is met; the tracked feature points are input to the back end, and front-end processing is completed.
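The left-eye optical flow tracking described here might be realized with OpenCV's pyramidal Lucas-Kanade implementation, as in the sketch below; window size and pyramid depth are illustrative choices, not values from the patent.

```python
import cv2
import numpy as np

def lk_track(prev_gray, cur_gray, prev_pts):
    """Pyramidal Lucas-Kanade optical flow for left-eye feature tracking."""
    p0 = prev_pts.reshape(-1, 1, 2).astype(np.float32)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, p0, None, winSize=(21, 21), maxLevel=2)
    good = status.ravel() == 1                   # keep successfully tracked points
    return p0[good].reshape(-1, 2), p1[good].reshape(-1, 2)
```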
Step 3: IMU initialization and pre-integration, filter initialization
The IMU is initialized statically, which determines the direction of gravitational acceleration and provides it for camera initialization. The IMU data needs to be pre-integrated to serve as the prediction of the EKF, and the pre-integration method may be, but is not limited to, the fourth-order Runge-Kutta (RK4) algorithm.
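A minimal sketch of one RK4 integration step is given below, applied to a toy position/velocity model with bias and gravity handling omitted; the actual pre-integration operates on the full IMU state.

```python
import numpy as np

def rk4_step(f, x, u, dt):
    """One classic fourth-order Runge-Kutta step for the derivative f(x, u)."""
    k1 = f(x, u)
    k2 = f(x + 0.5 * dt * k1, u)
    k3 = f(x + 0.5 * dt * k2, u)
    k4 = f(x + dt * k3, u)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def pv_dynamics(x, acc):
    """Toy model: x = [p, v]; returns [v, acc]."""
    return np.concatenate([x[3:], acc])

# Usage: x_next = rk4_step(pv_dynamics, x, acc_measurement, dt=0.005)
```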
Initializing the filter means setting the initial values of its parameters, in particular the initial values of the covariance matrix and the system noise, which play an important role in the filtering accuracy. The specific process is as follows: first, a continuous IMU error model is established; second, the system matrices F and G are discretized; then, the IMU covariance at the current moment is predicted from the covariance at the previous moment; and finally, an observability-consistency correction is applied to the covariance prediction equation of the system.
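The covariance prediction step described here, under a first-order discretization of the continuous error model, might be sketched as follows; the observability-consistency correction is omitted, and all names are illustrative.

```python
import numpy as np

def propagate_covariance(P, F, G, Q, dt):
    """Discretize the continuous error model and propagate the covariance."""
    n = F.shape[0]
    Phi = np.eye(n) + F * dt            # Phi ~ expm(F*dt), first-order truncation
    Qd = G @ Q @ G.T * dt               # discretized process noise
    return Phi @ P @ Phi.T + Qd
```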
Step 4: Camera state synchronization with IMU and covariance augmentation
When new camera information (namely the feature point information of the current observation frame) is input to the back end, the current IMU pose is predicted through pre-integration, the relative pose between the IMU and the camera is used to calculate the camera pose, and the poses of the two sensors are synchronized. When a camera state is added to the system, the covariance needs to be augmented.
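The covariance augmentation mentioned here follows the standard MSCKF pattern: the new camera-pose block is appended using the Jacobian J of the new pose with respect to the existing state. A sketch under that assumption:

```python
import numpy as np

def augment_covariance(P, J):
    """Append a new camera-pose block to the state covariance.
    J is the Jacobian of the new camera pose w.r.t. the existing state."""
    return np.block([[P,      P @ J.T],
                     [J @ P,  J @ P @ J.T]])
```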
Step 5: Constructing a map and processing the feature points
The visual information is received, processed by the front end, and input to the back end, and a local map is built from the feature points to constrain them. Feature points are easily lost during tracking, so when a feature point is lost, an EKF (extended Kalman filter) update is performed using the lost feature point, and an optimized pose is output.
Step 6: sliding window
Because the feature points are constrained by multiple frames, a sliding window must be adopted to cull some frames in order to continuously update the constraint relationship and guarantee the stability and real-time performance of the algorithm; culling updates the constraint relationship and reduces the number of constraints. Compared with the sliding window strategies in algorithms such as VINS, ORB, ICE-BA and S-MSCKF, the present invention provides a new sliding window strategy that applies a different culling method to different observation information, and operates as follows:
(1) When the number of observation frames in the window meets the frame number threshold but is smaller than the maximum window number, and the feature point tracking rate of the current frame is 100% (i.e. the current frame has no lost feature points), the angle difference and distance difference between each candidate frame and the oldest first frame are calculated, starting from the oldest second frame in the window; if they meet the threshold, the candidate frame is culled, and so on in turn, with at most 1/3 of the maximum window number of observation frames being culled.
(2) When the number of observation frames in the window reaches the maximum window number, the distance difference and angle difference between the oldest first frame and the oldest second frame are calculated; if they meet the threshold, the oldest first frame is culled, otherwise it is retained. Then, ten frames are culled at equal intervals, starting from the oldest second frame in the window.
Step 7: System update
The main task of the system update is first to take the predicted values of the current state and covariance obtained in the prediction module, then to construct a measurement model from the screened feature points, and to fuse the two kinds of information through the extended Kalman filter to obtain the estimate at the current moment. It is worth noting that the SLAM positioning method obtains the current pose estimate after the EKF update, and test results on the EuRoC dataset show that the positioning accuracy is greatly improved.
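The fusion described here is the standard extended Kalman filter measurement update; the following is a generic sketch, not the patent's exact formulation (an MSCKF-style system would additionally project the measurement model onto the null space of the feature Jacobian).

```python
import numpy as np

def ekf_update(x, P, z, h, H, Rn):
    """Generic EKF measurement update: fuse the predicted state with the
    measurement model built from the screened feature points."""
    y = z - h(x)                                  # innovation
    S = H @ P @ H.T + Rn                          # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(P.shape[0]) - K @ H) @ P
    return x_new, P_new
```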
Illustrative System
Referring to FIG. 10 of the drawings, a classification sliding window system according to an embodiment of the invention is illustrated. Specifically, as shown in fig. 10, the classification sliding window system 10 is used for performing classified sliding-window processing on a window, and may include a determining module 11, a first culling module 12, and a second culling module 13, which are communicatively connected to each other, wherein the determining module 11 is configured to determine whether the number of all observation frames in the window reaches the maximum window number of the window; the first culling module 12 is communicatively connected to the determining module 11, and configured to, when the number of observation frames reaches the maximum window number, batch cull a predetermined number of observation frames from the window at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; the second culling module 13 is communicatively connected to the determining module 11, and is configured to further determine whether the number of the observation frames is greater than a preset frame number threshold when the number of the observation frames is less than the maximum window number, and if so, selectively cull observation frames from the window according to the feature point tracking rate of the current observation frame; if not, all the observation frames in the window are retained.
It should be noted that, in the above embodiment of the present invention, as shown in fig. 10, the classifying sliding window system 10 may further include an adding module 14, where the adding module 14 is communicably connected to the first culling module 12 and the second culling module 13, respectively, for adding the current observation frame to the window as a latest frame in the window.
Still further, in an example of the present invention, as shown in fig. 10, the first culling module 12 may include a pose calculation module 121 and a batch culling module 122 communicably connected to each other. The pose calculation module 121 is configured to calculate the relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold. The batch culling module 122 is configured to cull the oldest first frame when the relative pose is greater than the first pose threshold, and to batch-cull a first predetermined number of observation frames at intervals, starting from the oldest second frame in the window; and when the relative pose is not greater than the first pose threshold, to retain the oldest first frame and batch-cull a second predetermined number of observation frames at intervals, starting from the oldest second frame in the window.
In an example of the present invention, as shown in fig. 10, the second culling module 13 may include a detecting module 131, a selective culling module 132, and a retaining module 133, wherein the detecting module 131 is configured to detect the feature point tracking rate of the current observation frame to determine whether it is 100%; the selective culling module 132 is communicably connected to the detecting module 131, and configured to, when the feature point tracking rate of the current observation frame is 100%, sequentially calculate, starting from the oldest second frame in the window, the relative pose between each observation frame to be culled and the oldest first frame in the window to determine whether the relative pose is smaller than a second pose threshold, and if so, cull the observation frame to be culled; if not, retain it; the retaining module 133 is communicably connected to the detecting module 131, and configured to retain all the observation frames to be culled in the window when the feature point tracking rate of the current observation frame is less than 100%.
Preferably, as shown in fig. 10, the second culling module 13 further includes a monitoring module 134, wherein the monitoring module 134 is configured to monitor the number of observation frames to be culled from the window, and stop the culling operation when the number of observation frames to be culled from the window reaches 1/3 of the maximum window number.
According to another aspect of the present invention, as shown in fig. 11, an embodiment of the present invention further provides a SLAM positioning system 1. Specifically, as shown in fig. 11, the SLAM positioning system 1 includes the classification sliding window system 10, a front-end system 20, a filter prediction system 30, a mapping system 40, and a filter estimation system 50. The front-end system 20 is configured to perform front-end processing on an original image acquired by a binocular camera to obtain feature point information of a current observation frame. The filter prediction system 30 is configured to perform filter prediction processing on IMU information acquired by the inertial measurement unit to obtain a predicted pose and a predicted speed of the binocular camera. The mapping system 40 may include a mapping component module 41 and a feature point determination module 42 communicatively connected, wherein the mapping component module 41 is communicatively connected with the front-end system 20 and the filter prediction system 30, respectively, for mapping according to the feature point information of the current observation frame; the feature point determination module 42 is configured to determine whether there is a feature point with a tracking loss, and then perform estimation processing through the filter to obtain an estimated pose and an estimated speed of the binocular camera. The classification sliding window system 10 is configured to perform classification sliding window processing by the classification sliding window method based on the feature point information of the current observation frame, so as to determine whether there is a removed observation frame. The filter estimation system 50 is communicably connected to the classification sliding window system 10, and is further configured to, when the rejected observation frames exist, perform filter estimation processing on feature point information in the rejected observation frames according to the estimated pose and the estimated speed of the binocular camera, so as to obtain an optimized pose and an optimized speed of the binocular camera; and when the rejected observation frames do not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
It should be noted that, in an embodiment of the present invention, as shown in fig. 11, the front-end system 20 may include an optical flow tracking module 21 and an epipolar search and block matching module 22, which are communicably connected to each other, wherein the optical flow tracking module 21 is configured to track feature points of a left target image in a current original image by an optical flow tracking method to obtain left target feature point information in the current observation frame; the epipolar search and block matching module 22 is configured to track feature points of a right eye image in the current original image by an epipolar search and block matching method according to a relative pose between a left eye camera and a right eye camera in the binocular cameras, so as to obtain information of the feature points of the right eye in the current observation frame.
Preferably, as shown in fig. 11, the front-end system 20 may further include a judgment extraction module 23, where the judgment extraction module 23 is configured to judge whether the number of feature points of the left eye image tracked by the optical flow tracking method is less than a feature point number threshold, and if so, extract new feature point information from the left eye image in the current original image by the feature point extraction method to supplement the left eye feature point information in the current observation frame.
In an example of the present invention, as shown in fig. 11, the filter estimation system 50 may be further configured to, when there is the tracking-lost feature point, perform filter estimation processing on the tracking-lost feature point information according to the predicted pose and the predicted speed of the binocular camera to obtain an estimated pose and an estimated speed of the binocular camera; and when the tracking-lost feature points do not exist, directly taking the predicted pose and the predicted speed of the binocular camera as the estimated pose and the estimated speed of the binocular camera.
In another example of the present invention, as shown in fig. 11, the mapping system 40 may further include a feature point screening module 43, configured to screen a predetermined number of feature points from the feature point information of the current observation frame when there is no feature point with the tracking loss; the filter estimation system 50 is further configured to perform filter estimation processing on the screened feature points according to the predicted pose and the predicted speed of the binocular camera to obtain an estimated pose and an estimated speed of the binocular camera.
Illustrative electronic device
Next, an electronic apparatus according to an embodiment of the present invention is described with reference to fig. 12. As shown in fig. 12, the electronic device 90 includes one or more processors 91 and a memory 92.
The processor 91 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 90 to perform desired functions. In other words, the processor 91 comprises one or more physical devices configured to execute instructions. For example, the processor 91 may be configured to execute instructions that are part of: one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, implement a technical effect, or otherwise arrive at a desired result.
The processor 91 may include one or more processors configured to execute software instructions. Additionally or alternatively, the processor 91 may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. The processors of the processor 91 may be single core or multicore, and the instructions executed thereon may be configured for serial, parallel, and/or distributed processing. The various components of the processor 91 may optionally be distributed over two or more separate devices, which may be remotely located and/or configured for coordinated processing. Aspects of the processor 91 may be virtualized and executed by remotely accessible networked computing devices configured in a cloud computing configuration.
The memory 92 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 91 to implement some or all of the steps of the above-described exemplary methods of the present invention, and/or other desired functions.
In other words, the memory 92 comprises one or more physical devices configured to hold machine-readable instructions executable by the processor 91 to implement the methods and processes described herein. In implementing these methods and processes, the state of the memory 92 may be transformed (e.g., to hold different data). The memory 92 may include removable and/or built-in devices. The memory 92 may include optical memory (e.g., CD, DVD, HD-DVD, blu-ray disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others. The memory 92 may include volatile, nonvolatile, dynamic, static, read/write, read-only, random-access, sequential-access, location-addressable, file-addressable, and/or content-addressable devices.
It is understood that the memory 92 comprises one or more physical devices. However, aspects of the instructions described herein may alternatively be propagated by a communication medium (e.g., an electromagnetic signal, an optical signal, etc.) that is not held by a physical device for a limited period of time. Aspects of the processor 91 and the memory 92 may be integrated together into one or more hardware logic components. These hardware logic components may include, for example, Field Programmable Gate Arrays (FPGAs), program and application specific integrated circuits (PASIC/ASIC), program and application specific standard products (PSSP/ASSP), system on a chip (SOC), and Complex Programmable Logic Devices (CPLDs).
In one example, as shown in FIG. 12, the electronic device 90 may also include an input device 93 and an output device 94, which may be interconnected via a bus system and/or other form of connection mechanism (not shown). The input device 93 may be, for example, a camera module or the like for capturing image data or video data. As another example, the input device 93 may include or interface with one or more user input devices such as a keyboard, mouse, touch screen, or game controller. In some embodiments, the input device 93 may include or interface with a selected Natural User Input (NUI) component. Such component parts may be integrated or peripheral and the transduction and/or processing of input actions may be processed on-board or off-board. Example NUI components may include a microphone for speech and/or voice recognition; infrared, color, stereo display and/or depth camera for machine vision and/or gesture recognition; a head tracker, eye tracker, accelerometer and/or gyroscope for motion detection and/or intent recognition; and an electric field sensing component for assessing brain activity and/or body movement; and/or any other suitable sensor.
The output device 94 may output various information including the classification result and the like to the outside. The output devices 94 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, the electronic device 90 may further comprise the communication means, wherein the communication means may be configured to communicatively couple the electronic device 90 with one or more other computer devices. The communication means may comprise wired and/or wireless communication devices compatible with one or more different communication protocols. As a non-limiting example, the communication subsystem may be configured for communication via a wireless telephone network or a wired or wireless local or wide area network. In some embodiments, the communications device may allow the electronic device 90 to send and/or receive messages to and/or from other devices via a network such as the internet.
It will be appreciated that the configurations and/or approaches described herein are exemplary in nature, and that these specific embodiments or examples are not to be considered in a limiting sense, because numerous variations are possible. The specific routines or methods described herein may represent one or more of any number of processing strategies. As such, various acts illustrated and/or described may be performed in the sequence illustrated and/or described, in other sequences, in parallel, or omitted. Also, the order of the above-described processes may be changed.
Of course, for the sake of simplicity, only some of the components of the electronic device 90 relevant to the present invention are shown in fig. 12, and components such as buses, input/output interfaces, and the like are omitted. In addition, the electronic device 90 may include any other suitable components, depending on the particular application.
Illustrative computer program product
In addition to the above-described methods and apparatus, embodiments of the present invention may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the methods according to various embodiments of the present invention described in the "exemplary methods" section above of this specification.
The computer program product may write program code for carrying out operations of embodiments of the present invention in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the C language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, an embodiment of the present invention may also be a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, cause the processor to perform the steps of the above-described method of the present specification.
The computer readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present invention have been described above with reference to specific embodiments, but it should be noted that the advantages, effects, etc. mentioned in the present invention are only examples and are not limiting, and the advantages, effects, etc. must not be considered to be possessed by various embodiments of the present invention. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the invention is not limited to the specific details described above.
The block diagrams of devices, apparatuses and systems involved in the present invention are given as illustrative examples only and are not intended to require or imply that connections, arrangements or configurations must be made in the manner shown in the block diagrams. These devices, apparatuses and systems may be connected, arranged or configured in any manner, as will be appreciated by those skilled in the art. Words such as "including", "comprising" and "having" are open-ended words that mean "including, but not limited to" and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the word "and/or", unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the apparatus, devices and methods of the present invention, the components or steps may be broken down and/or re-combined. These decompositions and/or recombinations are to be regarded as equivalents of the present invention.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It will be appreciated by persons skilled in the art that the embodiments of the invention described above and shown in the drawings are given by way of example only and are not limiting of the invention. The objects of the invention have been fully and effectively accomplished. The functional and structural principles of the present invention have been shown and described in the examples, and any variations or modifications of the embodiments of the present invention may be made without departing from the principles.

Claims (22)

1. A method of classifying sliding windows, comprising the steps of:
s110: determining whether the number of all observation frames in the window reaches the maximum window number of the window;
s120: when the number of the observation frames reaches the maximum window number, according to the relative pose between the oldest first frame and the oldest second frame in the window, batch removing a preset number of observation frames from the window at intervals; and
s130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is larger than a preset frame number threshold value, if so, selectively removing the observation frames from the window according to the characteristic point tracking rate of the current observation frames; if not, all observation frames in the window are reserved.
2. The classification sliding window method of claim 1, further comprising the steps of:
s140: the current observation frame is added to the window as the latest frame in the window.
3. The classification sliding window method according to claim 2, wherein the step S120 comprises the steps of:
calculating the relative pose between the oldest first frame and the oldest second frame in the window to judge whether the relative pose is greater than a first pose threshold value;
when the relative pose is larger than the first pose threshold value, the oldest first frame in the window is removed, and a first preset number of observation frames are removed in batches at intervals from the oldest second frame in the window; and
when the relative pose is not larger than the first pose threshold value, the oldest first frame in the window is reserved, and a second preset number of observation frames are batch-culled at intervals from the oldest second frame in the window.
4. The classification sliding window method of claim 3, wherein in the step S120, the observation frames in the window are batch-culled at equal intervals, starting from the oldest second frame in the window.
5. The classification sliding window method according to any one of claims 1 to 4, wherein the step S130 comprises the steps of:
detecting the characteristic point tracking rate of the current observation frame to determine whether the characteristic point tracking rate of the current observation frame is 100%;
when the feature point tracking rate of the current observation frame is 100%, sequentially calculating the relative pose between the observation frame to be rejected and the oldest first frame in the window from the oldest second frame in the window to judge whether the relative pose is smaller than a second pose threshold value, and if so, rejecting the observation frame to be rejected; if not, the observation frame to be eliminated is reserved; and
and when the tracking rate of the feature points of the current observation frame is less than 100%, retaining all the observation frames to be eliminated in the window.
6. The classification sliding window method of claim 5, wherein the step S130 further comprises the steps of:
the number of the observation frames to be rejected which are rejected from the window is monitored, so that the rejection operation is stopped when the rejection number of the observation frames to be rejected reaches 1/3 of the maximum window number.
7. A SLAM positioning method, comprising the steps of:
performing front-end processing on an original image acquired by a binocular camera to obtain feature point information of a current observation frame;
carrying out filter prediction processing on IMU information acquired by an inertial measurement unit to obtain the predicted pose and the predicted speed of the binocular camera;
performing map construction according to the feature point information of the current observation frame to determine whether tracking lost feature point information exists, and further performing filter estimation processing to obtain an estimated pose and an estimated speed of the binocular camera;
based on the feature point information of the current observation frame, performing sliding window processing by a classification sliding window method to determine whether the removed observation frame exists; and
when the removed observation frame exists, performing filter estimation processing on the feature point information in the removed observation frame according to the estimated pose and the estimated speed of the binocular camera to obtain the optimized pose and the optimized speed of the binocular camera; and when the rejected observation frame does not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
8. The SLAM positioning method of claim 7, wherein the classification sliding window method comprises the steps of:
s110: determining whether the number of all observation frames in the window reaches the maximum window number of the window;
s120: when the number of the observation frames reaches the maximum window number, according to the relative pose between the oldest first frame and the oldest second frame in the window, batch removing a preset number of observation frames from the window at intervals; and
s130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is larger than a preset frame number threshold value, if so, selectively removing the observation frames from the window according to the characteristic point tracking rate of the current observation frames; if not, all observation frames in the window are reserved.
9. The SLAM locating method according to claim 7 or 8, wherein the step of front-end processing the original image acquired by the binocular camera to obtain the feature point information of the current observation frame comprises the steps of:
tracking the feature points of the left target image in the original image by an optical flow tracking method to obtain the feature point information of the left target in the current observation frame; and
and tracking the feature points of the right eye image in the original image by an epipolar search and block matching method according to the relative pose between the left eye camera and the right eye camera in the binocular camera so as to obtain the feature point information of the right eye in the current observation frame.
10. The SLAM locating method of claim 9, wherein the step of front-end processing the original image acquired by the binocular camera to obtain the feature point information of the current observation frame, further comprises the steps of:
and judging whether the number of the feature points of the left eye image tracked by the optical flow tracking method is smaller than a feature point number threshold, if so, extracting new feature point information from the left eye image by a feature point extraction method so as to supplement the feature point information of the left eye in the current observation frame.
11. The SLAM positioning method of claim 10, wherein the step of performing map construction according to the feature point information of the current observation frame to determine whether there is a feature point with a tracking loss, and further obtaining the estimated pose and the estimated speed of the binocular camera through filter estimation processing comprises the steps of:
when the tracking-lost feature point exists, carrying out filter estimation processing on the information of the tracking-lost feature point according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
when the tracking-lost feature point does not exist, the predicted pose and the predicted speed of the binocular camera are directly taken as the estimated pose and the estimated speed of the binocular camera.
12. The SLAM positioning method of claim 10, wherein the step of performing map construction according to the feature point information of the current observation frame to determine whether there is a feature point with a tracking loss, and further obtaining the estimated pose and the estimated speed of the binocular camera through filter estimation processing comprises the steps of:
when the tracking-lost feature point exists, carrying out filter estimation processing on the information of the tracking-lost feature point according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
when the tracking-lost feature points do not exist, a preset number of feature points are screened from the feature points of the current observation frame, and then filtering estimation processing is carried out on the information of the screened feature points according to the predicted pose and the predicted speed of the binocular camera, so as to obtain the estimated pose and the estimated speed of the binocular camera.
13. A classification sliding window system for classifying sliding windows, wherein the classification sliding window system comprises:
a determining module, configured to determine whether the number of all observation frames in the window reaches the maximum window number of the window;
a first eliminating module, wherein the first eliminating module is communicably connected to the determining module and is configured to, when the number of the observation frames reaches the maximum window number, eliminate a predetermined number of observation frames in batches from the window at intervals according to a relative pose between an oldest first frame and an oldest second frame in the window; and
a second eliminating module, wherein the second eliminating module is communicably connected to the determining module, and is configured to further determine whether the number of the observation frames is greater than a preset frame number threshold when the number of the observation frames is less than the maximum window number, and if so, selectively eliminate the observation frames from the window according to the feature point tracking rate of the current observation frame; if not, all observation frames in the window are reserved.
14. The classification sliding window system of claim 13, further comprising:
an adding module, wherein the adding module is respectively connected with the first eliminating module and the second eliminating module in a communication way, and is used for adding the current observation frame to the window to be the latest frame in the window.
15. The classification sliding window system of claim 14, wherein the first culling module comprises a pose calculation module and a batch culling module communicatively coupled to each other, wherein the pose calculation module is configured to calculate a relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold; the batch elimination module is used for eliminating the oldest first frame in the window when the relative pose is larger than the first pose threshold value, and eliminating a first preset number of observation frames in batches at intervals from the oldest second frame in the window; and the batch elimination module is further used for reserving the oldest first frame in the window and batch eliminating a second preset number of observation frames at intervals from the oldest second frame in the window when the relative pose is not larger than the first pose threshold.
16. The classification sliding window system according to any one of claims 13 to 15, wherein the second culling module comprises a detecting module, a selective culling module and a retaining module, wherein the detecting module is configured to detect the feature point tracking rate of the current observation frame to determine whether the feature point tracking rate of the current observation frame is 100%; the selective elimination module is communicably connected to the detection module and is used for sequentially calculating the relative pose between the observation frame to be eliminated and the oldest first frame in the window from the oldest second frame in the window when the feature point tracking rate of the current observation frame is 100%, so as to judge whether the relative pose is smaller than a second pose threshold value, and if so, eliminating the observation frame to be eliminated; if not, the observation frame to be eliminated is reserved; the reserving module is communicably connected to the detecting module and is used for reserving all the observation frames to be rejected in the window when the feature point tracking rate of the current observation frame is less than 100%.
17. The classification sliding window system of claim 16, wherein the second culling module further comprises a monitoring module for monitoring the number of observation frames to be culled from the window to stop culling when the culling number of observation frames to be culled reaches 1/3 of the maximum window number.
18. A SLAM positioning system for positioning based on an original image acquired by a binocular camera and IMU information acquired by an inertial measurement unit, wherein the SLAM positioning system comprises:
the front-end system is used for carrying out front-end processing on the original image to obtain the characteristic point information of the current observation frame;
the filter prediction system is used for carrying out filter prediction processing on the IMU information so as to obtain the predicted pose and the predicted speed of the binocular camera;
the map construction system comprises a map construction module and a characteristic point determination module which are mutually connected in a communication way, wherein the map construction module is respectively connected with the front end system and the filter prediction system in a communication way and used for carrying out map construction according to the characteristic point information of the current observation frame, the characteristic point determination module is used for determining whether the characteristic point information with tracking loss exists or not, and further the estimated pose and the estimated speed of the binocular camera are obtained through filter estimation processing;
a classification sliding window system, which is used for performing sliding window processing by a classification sliding window method based on the characteristic point information of the current observation frame so as to determine whether the removed observation frame exists; and
the filter estimation system is communicably connected to the classification sliding window system and is used for performing filter estimation processing on the feature point information in the rejected observation frames according to the estimated pose and the estimated speed of the binocular camera when the rejected observation frames exist, so as to obtain the optimized pose and the optimized speed of the binocular camera; and when the rejected observation frame does not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
19. The SLAM localization system of claim 18, wherein the front-end system comprises an optical flow tracking module, an epipolar search and block matching module, and a decision extraction module communicatively connected to each other, wherein the optical flow tracking module is configured to track feature points of the left eye image in the original image by an optical flow tracking method to obtain feature point information of the left eye image in the current observation frame; the polar line searching and block matching module is used for tracking the feature points of a right eye image in the original image by a polar line searching and block matching method according to the relative pose between a left eye camera and a right eye camera in the binocular camera so as to obtain the feature point information of the right eye image in the current observation frame; the judging and extracting module is used for judging whether the number of the feature points of the left eye image tracked by the optical flow tracking method is smaller than a feature point number threshold value or not, if so, new feature point information is extracted from the left eye image by the feature point extracting method so as to supplement the feature point information of the left eye in the current observation frame.
20. The SLAM locating system of claim 19 wherein the filter estimation system is further configured to, when there is the tracking-lost feature point, filter estimate processing information of the tracking-lost feature point based on the predicted pose and the predicted velocity of the binocular camera to obtain the estimated pose and the estimated velocity of the binocular camera; and when the tracking-lost feature point does not exist, directly taking the predicted pose and the predicted speed of the binocular camera as the estimated pose and the estimated speed of the binocular camera.
21. The SLAM locating system of claim 19 wherein the mapping system further comprises a feature point screening module for screening a predetermined number of feature points from the feature points of the current observation frame when there is no feature point with the tracking loss; and the filter estimation system is also used for carrying out filtering estimation processing on the information of the screened feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera.
22. An electronic device, comprising:
at least one processor configured to execute instructions; and
a memory communicatively coupled to the at least one processor, wherein the memory stores at least one instruction executable by the at least one processor to cause the at least one processor to perform some or all of the steps of a SLAM positioning method, wherein the SLAM positioning method comprises the steps of:
performing front-end processing on an original image acquired by a binocular camera to obtain feature point information of a current observation frame;
performing filter prediction processing on IMU information acquired by an inertial measurement unit to obtain the predicted pose and the predicted speed of the binocular camera;
performing map construction according to the feature point information of the current observation frame to determine whether tracking-lost feature point information exists, and further performing filter estimation processing to obtain the estimated pose and the estimated speed of the binocular camera;
performing sliding window processing by a classification sliding window method based on the feature point information of the current observation frame, so as to determine whether a rejected observation frame exists; and
when a rejected observation frame exists, performing filter estimation processing on the feature point information in the rejected observation frame according to the estimated pose and the estimated speed of the binocular camera, so as to obtain the optimized pose and the optimized speed of the binocular camera; and when no rejected observation frame exists, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
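Taken end to end, the five claimed steps compose into the orchestration sketch below. Every callable here (`front_end`, `filter_predict`, `map_builder`, `sliding_window`, `filter_update`) is a hypothetical stand-in for the corresponding claimed system, not an interface the patent defines.

```python
def slam_positioning_step(frame, imu_batch, front_end, filter_predict,
                          map_builder, sliding_window, filter_update):
    """One iteration of the claimed SLAM positioning method, in sketch form."""
    features = front_end(frame)            # front-end processing of the stereo pair
    predicted = filter_predict(imu_batch)  # filter prediction from IMU information
    lost = map_builder(features)           # map construction; returns any
                                           # tracking-lost feature information
    estimated = filter_update(predicted, lost) if lost else predicted
    rejected = sliding_window(features)    # classification sliding window processing
    if rejected is not None:               # final optimization on the rejected frame
        return filter_update(estimated, rejected["features"])
    return estimated
```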
CN201911326341.7A 2019-12-20 2019-12-20 Classification sliding window method, SLAM positioning method, system and electronic equipment Active CN113011231B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911326341.7A CN113011231B (en) 2019-12-20 2019-12-20 Classification sliding window method, SLAM positioning method, system and electronic equipment


Publications (2)

Publication Number Publication Date
CN113011231A (en) 2021-06-22
CN113011231B (en) 2023-07-07

Family

ID=76382731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911326341.7A Active CN113011231B (en) 2019-12-20 2019-12-20 Classification sliding window method, SLAM positioning method, system and electronic equipment

Country Status (1)

Country Link
CN (1) CN113011231B (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7352386B1 (en) * 1999-06-22 2008-04-01 Microsoft Corporation Method and apparatus for recovering a three-dimensional scene from two-dimensional images
CN102289507A (en) * 2011-08-30 2011-12-21 王洁 Method for mining data flow weighted frequent mode based on sliding window
KR101220463B1 (en) * 2012-09-21 2013-01-10 김병수 Synthetic of rail structure heat slide window and door frame
US20140341465A1 (en) * 2013-05-16 2014-11-20 The Regents Of The University Of California Real-time pose estimation system using inertial and feature measurements
US20160026898A1 (en) * 2014-07-24 2016-01-28 Agt International Gmbh Method and system for object detection with multi-scale single pass sliding window hog linear svm classifiers
CN105338236A (en) * 2014-07-25 2016-02-17 诺基亚技术有限公司 Method and apparatus for detecting object in image and electronic device
US20170011520A1 (en) * 2015-07-09 2017-01-12 Texas Instruments Incorporated Window grouping and tracking for fast object detection
CN106226750A (en) * 2016-07-01 2016-12-14 电子科技大学 A kind of some mark sequence smooth filtering method for multi-frame joint detection
US20180189587A1 (en) * 2016-12-29 2018-07-05 Intel Corporation Technologies for feature detection and tracking
US20180188384A1 (en) * 2017-01-04 2018-07-05 Qualcomm Incorporated Systems and methods for using a sliding window of global positioning epochs in visual-inertial odometry
CN110070582A (en) * 2018-01-23 2019-07-30 舜宇光学(浙江)研究院有限公司 Take the photograph mould group parameter self-calibration system and calibration method and its electronic equipment more
CN108681439A (en) * 2018-05-29 2018-10-19 北京维盛泰科科技有限公司 Uniform display methods based on frame per second control
CN110125928A (en) * 2019-03-27 2019-08-16 浙江工业大学 A kind of binocular inertial navigation SLAM system carrying out characteristic matching based on before and after frames
CN110044354A (en) * 2019-03-28 2019-07-23 东南大学 A kind of binocular vision indoor positioning and build drawing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Yongli, et al.: "Research on Positioning and Navigation Method Based on Visual SLAM and Artificial Marker Codes", Group Technology & Production Modernization, vol. 35, no. 4, p. 22 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539982A (en) * 2020-04-17 2020-08-14 北京维盛泰科科技有限公司 Visual inertial navigation initialization method based on nonlinear optimization in mobile platform
CN111539982B (en) * 2020-04-17 2023-09-15 北京维盛泰科科技有限公司 Visual inertial navigation initialization method based on nonlinear optimization in mobile platform
CN116778532A (en) * 2023-08-24 2023-09-19 汶上义桥煤矿有限责任公司 Underground coal mine personnel target tracking method
CN116778532B (en) * 2023-08-24 2023-11-07 汶上义桥煤矿有限责任公司 Underground coal mine personnel target tracking method

Also Published As

Publication number Publication date
CN113011231B (en) 2023-07-07

Similar Documents

Publication Publication Date Title
US11816585B2 (en) Machine learning models operating at different frequencies for autonomous vehicles
US10067157B2 (en) Methods and systems for sensor-based vehicle acceleration determination
US11237184B2 (en) Methods and systems for pattern-based identification of a driver of a vehicle
CN107478220B (en) Unmanned aerial vehicle indoor navigation method and device, unmanned aerial vehicle and storage medium
EP2572319B1 (en) Method and system for fusing data arising from image sensors and from motion or position sensors
WO2020048623A1 (en) Estimation of a pose of a robot
KR20200084949A (en) Electronic device and control method thereof
KR102238522B1 (en) Vehicle and method for generating map corresponding to three-dimentional space
CN113011231B (en) Classification sliding window method, SLAM positioning method, system and electronic equipment
US11866056B2 (en) Ballistic estimation of vehicle data
WO2021059695A1 (en) Information processing device, information processing method, and information processing program
Spaenlehauer et al. A loosely-coupled approach for metric scale estimation in monocular vision-inertial systems
CN115100342A (en) Method, apparatus, electronic device, and storage medium for rendering image
KR20190031786A (en) Electronic device and method of obtaining feedback information thereof
JP7441848B2 (en) How to automatically determine optimal transportation service locations for points of interest from noisy multimodal data
WO2023130842A1 (en) Camera pose determining method and apparatus
CN113094545A (en) Redundant key frame eliminating method, SLAM method, system and electronic equipment thereof
CN114970112B (en) Method, device, electronic equipment and storage medium for automatic driving simulation
CN113012216B (en) Feature classification optimization method, SLAM positioning method, system and electronic equipment
US11443184B2 (en) Methods and systems for predicting a trajectory of a road agent based on an intermediate space
CN115049731A (en) Visual mapping and positioning method based on binocular camera
CN113129333A (en) Multi-target real-time tracking method and system and electronic equipment
KR101847113B1 (en) Estimation method and apparatus for information corresponding camera orientation by using image
WO2020125965A1 (en) Device and method for detecting user activity by parallelized classification
US20230267718A1 (en) Systems and methods for training event prediction models for camera-based warning systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210622

Assignee: Zhejiang Shunwei Technology Co.,Ltd.

Assignor: SUNNY OPTICAL (ZHEJIANG) RESEARCH INSTITUTE Co.,Ltd.

Contract record no.: X2024330000055

Denomination of invention: Classification sliding window method and SLAM positioning method, as well as their systems and electronic devices

Granted publication date: 20230707

License type: Common License

Record date: 20240515