CN111753799A - Active binocular-based vision sensor and robot - Google Patents

Active binocular-based vision sensor and robot

Info

Publication number
CN111753799A
Authority
CN
China
Prior art keywords
binocular
depth map
assembly
projection
view
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010630565.3A
Other languages
Chinese (zh)
Other versions
CN111753799B (en)
Inventor
张宏辉
章颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Muxin Intelligent Technology Co ltd
Original Assignee
Shenzhen Muxin Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Muxin Intelligent Technology Co ltd filed Critical Shenzhen Muxin Intelligent Technology Co ltd
Priority to CN202010630565.3A
Publication of CN111753799A
Application granted
Publication of CN111753799B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01B MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B11/00 Measuring arrangements characterised by the use of optical techniques
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01B MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B11/00 Measuring arrangements characterised by the use of optical techniques
    • G01B11/22 Measuring arrangements characterised by the use of optical techniques for measuring depth
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C11/00 Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention discloses an active binocular-based vision sensor, which comprises a binocular camera assembly, a laser projection assembly and a microprocessor, the microprocessor being communicatively connected to the binocular camera assembly and the laser projection assembly, respectively. The microprocessor is configured to switch the laser projection assembly on at intervals, so that the binocular camera assembly alternately captures a first binocular view containing the projected texture and a second binocular view without it. By projecting texture in this interval manner, visual positioning and 3D depth measurement can be achieved at the same time, so the robot system only needs to carry one vision sensor, which reduces the hardware cost of the robot.

Description

Active binocular-based vision sensor and robot
Technical Field
The invention relates to the technical field of computer vision, and in particular to an active binocular-based vision sensor and a robot.
Background
Depth measurement and visual positioning are two key technologies for ground-based mobile intelligent robots and are key factors affecting robot applications and performance. In current ground mobile robot systems, depth measurement and visual positioning are usually realized with separate vision sensors. Using different vision sensors for 3D depth measurement and visual positioning, however, increases the hardware cost of the robot system.
Disclosure of Invention
In view of the above shortcomings of the prior art, the technical problem to be solved by the invention is to provide an active binocular-based vision sensor and a robot.
In order to solve the above technical problem, a first aspect of embodiments of the present invention provides an active binocular-based vision sensor, where the vision sensor includes:
a binocular camera assembly for acquiring binocular views of a shooting scene;
a laser projection assembly for emitting a projected texture that can be sensed by the binocular camera assembly; and
a microprocessor communicatively connected to the binocular camera assembly and the laser projection assembly, respectively; the microprocessor is configured to switch the laser projection assembly on at intervals so that the binocular camera assembly alternately acquires a first binocular view and a second binocular view, wherein the first binocular view contains the projected texture and the second binocular view does not contain the projected texture.
In the active binocular-based vision sensor, the microprocessor is further configured to determine a depth map based on the first binocular view, and to determine positioning information corresponding to the depth map based on the depth map and its corresponding second binocular view.
In the active binocular-based vision sensor, the second binocular view corresponding to the depth map is the second binocular view that precedes, and is temporally adjacent to, the first binocular view corresponding to the depth map.
In the active binocular-based vision sensor, determining the positioning information corresponding to the depth map based on the depth map and its corresponding second binocular view specifically includes:
for each pixel point in the depth map, determining a first projection point of the pixel point on a left view and a second projection point on a right view in a second binocular view;
determining a first image block corresponding to the first projection point and a second image block corresponding to the second projection point so as to obtain an image block difference value corresponding to the pixel point;
and determining positioning information corresponding to the depth map based on the image block difference value corresponding to each pixel point in the depth map.
In the active binocular-based vision sensor, determining the positioning information corresponding to the depth map based on the image block difference value corresponding to each pixel point in the depth map specifically includes:
determining a change parameter of the depth map relative to the second binocular view based on the image block difference value corresponding to each pixel point in the depth map;
and determining the positioning information corresponding to the depth map based on the variation parameter and the positioning information corresponding to the second binocular view.
In the active binocular-based vision sensor, the vision sensor comprises:
an infrared fill-light assembly for emitting infrared light to provide supplementary illumination for the binocular camera assembly.
In the active binocular-based vision sensor, the infrared fill-light assembly is in an off state when the laser projection assembly is in an on state.
In the active binocular-based vision sensor, the binocular camera assembly comprises a first image collector and a second image collector arranged at an interval from each other, and the laser projection assembly is arranged between the first image collector and the second image collector.
A second aspect of embodiments of the present invention provides a robot equipped with an active binocular based vision sensor as described in any one of the above.
Beneficial effects: compared with the prior art, the invention provides an active binocular-based vision sensor comprising a binocular camera assembly, a laser projection assembly and a microprocessor, the microprocessor being communicatively connected to the binocular camera assembly and the laser projection assembly, respectively. The microprocessor switches the laser projection assembly on at intervals so that the binocular camera assembly alternately captures a first binocular view containing the projected texture and a second binocular view without it. With this interval-type texture projection, visual positioning and 3D depth measurement can be achieved at the same time, so the robot system only needs one vision sensor, which reduces the hardware cost of the robot.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is an exemplary diagram of an active binocular-based vision sensor provided by the present invention.
Fig. 2 is an exemplary diagram of the projected texture provided by the present invention.
Fig. 3 is a flowchart illustrating the operation of the microprocessor in the active binocular-based vision sensor provided by the present invention.
Fig. 4 is an example diagram of matching between the left view and the right view in the active binocular-based vision sensor provided by the present invention.
Fig. 5 is an exemplary diagram of the visual positioning projection relationship in the active binocular-based vision sensor provided by the present invention.
Detailed Description
The invention provides an active binocular-based vision sensor and a robot. To make the purpose, technical solution and effects of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The inventor has found that the vision sensors widely used in the robot field mainly include passive 3D vision sensors (for example, binocular 3D vision sensors and multi-view 3D vision sensors, where a binocular 3D vision sensor is a stereoscopic vision system composed of two cameras and a multi-view 3D vision sensor is a stereoscopic vision system composed of more than two cameras), active 3D vision sensors, and structured-light 3D vision sensors.
The passive 3D vision sensor follows the design principle of simulating human eyes: two viewpoints observe the same scene to obtain images from different viewing angles, and the positional deviation (parallax) between the images is then converted into three-dimensional position information of the scene by triangulation. A passive 3D vision sensor needs no additional light-emitting system and can be used for visual positioning, but its 3D depth measurement accuracy is low and it is easily affected by scene texture and illumination.
The active 3D vision sensor actively projects texture information onto the surface of the measured object through an optical device and performs stereo measurement by capturing the projected texture images with cameras. By actively projecting a texture image it avoids the passive sensor's susceptibility to scene texture and illumination. However, because the texture image is actively projected, the projected texture covers the scene texture in the field of view, so the scene texture cannot be captured and visual positioning based on it cannot be realized.
The structured-light 3D vision sensor projects a specially coded texture onto the surface of the measured object, captures the projected texture map with a camera, and obtains depth information by triangulation using the relative position of the projector and the camera. A structured-light 3D vision sensor needs only one camera to capture the projected texture map, but its measurement accuracy depends largely on the coding accuracy of the texture map, and high-accuracy texture coding requires a complex optical system, so the cost is high. Moreover, since the texture image is actively projected, the projected texture covers the scene texture in the field of view, so the scene texture cannot be captured and visual positioning based on it cannot be realized.
3D depth measurement and visual localization are both key technologies of ground mobile intelligent robots and key factors affecting robot applications and performance. Because a ground mobile intelligent robot needs both 3D depth measurement and visual positioning, two different vision sensors have to be mounted on the robot system, one for 3D depth measurement and one for visual positioning. On the one hand this increases the hardware cost of the robot system; on the other hand, the different vision sensors have to cooperate (for example, building an environment map requires visual positioning and 3D depth measurement information at the same time, so the spatial and temporal relationship between the two vision sensors must be strictly calibrated), which increases the control complexity of the robot system.
To solve the above problems, in the embodiment of the present invention the vision sensor includes a binocular camera assembly, a laser projection assembly and a microprocessor, the microprocessor being communicatively connected to the binocular camera assembly and the laser projection assembly, respectively. The microprocessor switches the laser projection assembly on at intervals so that the binocular camera assembly alternately captures a first binocular view containing the projected texture and a second binocular view without it. With this interval-type texture projection, visual positioning and 3D depth measurement can be achieved at the same time, so the robot system only needs one vision sensor, which reduces the hardware cost of the robot.
The invention will be further explained by the description of the embodiments with reference to the drawings.
This embodiment provides an active binocular-based vision sensor. As shown in Fig. 1, the vision sensor includes a binocular camera assembly 10, a laser projection assembly 20 and a microprocessor, the microprocessor being communicatively connected to the binocular camera assembly 10 and the laser projection assembly 20, respectively. The binocular camera assembly 10 collects binocular images within its shooting range; the laser projection assembly 20 emits a projected texture that can be sensed by the binocular camera assembly 10; and the microprocessor switches the laser projection assembly 20 on at intervals so that the binocular camera assembly 10 alternately captures a first binocular view containing the projected texture and a second binocular view without it. By controlling the laser projection assembly 20 to project texture intermittently, the binocular camera assembly 10 can acquire binocular images containing the projected texture for 3D depth measurement and binocular images without the projected texture for visual positioning, so a robot system that requires both only has to carry one vision sensor, which reduces the hardware cost of the robot.
Further, the binocular camera assembly 10 includes a first image collector and a second image collector; the binocular image comprises a first image collected by the first image collector and a second image collected by the second image collector. In one implementation of this embodiment, taking the shooting direction of the binocular camera assembly as the forward direction, the first image collector and the second image collector are arranged at an interval in the left-right direction, for example with the first image collector on the left side of the second image collector, or with the first image collector on the right side of the second image collector. Taking the case where the first image collector is on the left side of the second image collector as an example, the first image is the left view of the binocular image, the second image is the right view, and together the left view and the right view form the binocular view of the shooting scene corresponding to the binocular camera assembly.
Further, the laser projection assembly 20 is configured to emit a projected texture that can be sensed by the first image collector and the second image collector. It can be understood that the laser projection assembly projects the texture into, and covers, the shooting scene of the binocular camera assembly, so that both the left view captured by the first image collector and the right view captured by the second image collector contain the projected texture. The projected texture is the texture pattern displayed on the surface of an object in the shooting scene after the laser beam strikes it; it covers the object surface and can be recognized and recorded by the binocular camera assembly. For example, the projected texture is a texture image as shown in Fig. 2. In addition, in one implementation of this embodiment, the laser projection assembly is located between the first image collector and the second image collector. Of course, in practical applications the laser projection assembly may be arranged at other positions, as long as the texture it emits is projected onto the object surfaces in the shooting scene so that the binocular camera assembly can capture it.
Further, the microprocessor is communicatively connected to the binocular camera assembly 10 and the laser projection assembly 20, respectively, and is configured to control the laser projection assembly 20 to emit the projected texture at intervals and to control the binocular camera assembly 10 to acquire binocular images of the shooting scene. The microprocessor controls the laser projection assembly 20 so that, of any two adjacent acquisition moments, the projected texture is emitted at one and not at the other. It can be understood that, at the acquisition moments of two adjacent binocular images, the laser projection assembly 20 emits the projected texture once and does not emit it once, so that of two adjacent sets of binocular images acquired by the binocular camera assembly 10 one set contains the projected texture and the other does not. That is, the binocular camera assembly 10 alternately acquires a first binocular view containing the projected texture and a second binocular view not containing it. For example, if the binocular camera assembly 10 acquires binocular image A, binocular image B, binocular image C and binocular image D in time order, then binocular image A is a second binocular image, binocular image B is a first binocular image, binocular image C is a second binocular image, and binocular image D is a first binocular image.
Further, in one implementation of this embodiment, the first binocular image is used to measure depth information, and the second binocular image is used to determine the positioning information corresponding to itself and, in combination with the depth map, the positioning information corresponding to the depth map. For each depth map, the second binocular image that assists in determining its positioning information is the second binocular image that precedes it in time, with an acquisition moment adjacent to that of the first binocular image corresponding to the depth map. For example, if the vision sensor acquires second binocular image A, first binocular image A, second binocular image B and first binocular image B in acquisition order, then second binocular image A is used to determine its own positioning information, first binocular image A is used to determine depth map A, and second binocular image A assists depth map A in determining the positioning information corresponding to first binocular image A.
Based on this, the microprocessor is further configured to determine a depth map based on the first binocular view, and to determine the positioning information corresponding to the depth map based on the depth map and its corresponding second binocular view. Accordingly, as shown in Fig. 3, the working process of the microprocessor may be as follows. The microprocessor controls the binocular camera assembly to acquire a binocular image of the shooting scene and detects whether it contains the projected texture. If it does, depth information is calculated from the binocular image to obtain a depth map, the positioning information corresponding to the depth map is determined from the depth map and the previous binocular image acquired at the preceding acquisition moment, and the process moves to the next frame. If the binocular image does not contain the projected texture, the binocular image is output directly and the process moves to the next frame. In addition, when moving to the next frame, the working state of the laser projection assembly is checked: if it is on, it is switched off, and after it has been switched off the step of controlling the binocular camera assembly to acquire a binocular image of the shooting scene continues; if it is off, it is switched on, and after it has been switched on the step of controlling the binocular camera assembly to acquire a binocular image continues, until acquisition is finished.
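To make this control flow concrete, the following Python sketch restates the loop above under stated assumptions: the camera and laser driver objects, the compute_depth_map and compute_pose callables and their signatures are hypothetical stand-ins supplied by the caller, not part of the disclosed hardware interface, and texture detection is replaced by simply tracking the projector state.

```python
def run_sensor_loop(camera, laser, compute_depth_map, compute_pose, num_frames):
    """Sketch of the alternating acquisition loop: frames captured while the laser
    projector is on are used for depth measurement, frames captured while it is
    off are kept as texture-free views for visual positioning."""
    prev_plain_view = None   # last second binocular view (no projected texture)
    prev_pose = None         # positioning info obtained from the last solve
    results = []

    for _ in range(num_frames):
        left, right = camera.capture()                    # one binocular view
        if laser.is_on():
            # First binocular view: contains the projected texture.
            depth_map = compute_depth_map(left, right)
            pose = compute_pose(depth_map, prev_plain_view, prev_pose)
            prev_pose = pose                              # initial value for the next solve
            results.append(("depth", depth_map, pose))
            laser.turn_off()                              # next frame: plain scene texture
        else:
            # Second binocular view: scene texture only, output directly.
            prev_plain_view = (left, right)
            results.append(("plain", prev_plain_view))
            laser.turn_on()                               # next frame: projected texture
    return results
```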
In an implementation manner of this embodiment, the process of determining the depth map based on the first binocular view may be as follows. The left view acquired by the left camera (the first image collector) is taken as the reference image, and the right view acquired by the right camera (the second image collector) is taken as the image to be matched. First, an image block Q centered on a pixel point (x, y) is selected in the reference image; then the image to be matched is scanned line by line, the correlation between each candidate position and the image block Q is computed, and the candidate position with the highest correlation is selected as the matching position of the image block Q (e.g., the matching position (xa, ya) shown in Fig. 4). Finally, after the matching position has been obtained, the depth value of the pixel point (x, y) is determined from the position of the image block in the left view and the matching position in the right view; the depth value h may be computed as
h = f · B / d0
where f is the focal length of the first image collector (equal to that of the second image collector), B is the baseline, i.e. the distance between the first image collector and the second image collector, and d0 is the disparity between the image block Q and its corresponding matching position.
Of course, in practical applications, after the reference image and the image to be matched have been acquired and before line scanning, the reference image and the image to be matched are each rectified based on their respective calibration parameters, so that for each image block in the reference image its matching position in the image to be matched lies on the same row; the image blocks can then be matched along that row.
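As an illustration of the row-wise matching and the depth relation h = f·B/d0 described above, here is a minimal NumPy sketch; the SAD similarity measure, window size and disparity range are assumptions of this example, not values specified by the patent.

```python
import numpy as np

def match_and_depth(left, right, x, y, f, B, half_win=3, max_disp=64):
    """For the pixel (x, y) of the rectified left (reference) view, scan the same
    row of the rectified right view, pick the most similar block (here: minimum
    sum of absolute differences), and convert the disparity d0 into depth h = f*B/d0.
    left/right: 2-D grayscale arrays; f: focal length in pixels; B: baseline.
    Assumes (x, y) lies at least half_win pixels away from the image borders."""
    ref = left[y - half_win:y + half_win + 1,
               x - half_win:x + half_win + 1].astype(np.float32)
    best_cost, best_disp = np.inf, None
    for d in range(1, max_disp + 1):                      # candidate disparities
        xr = x - d                                        # matching column in the right view
        if xr - half_win < 0:
            break
        cand = right[y - half_win:y + half_win + 1,
                     xr - half_win:xr + half_win + 1].astype(np.float32)
        cost = np.abs(ref - cand).sum()                   # SAD block difference
        if cost < best_cost:
            best_cost, best_disp = cost, d
    if best_disp is None:
        return None                                       # no valid match on this row
    return f * B / best_disp                              # depth of pixel (x, y)
```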
In an implementation manner of this embodiment, the determining, based on the second binocular view corresponding to the depth map and the depth map, the positioning information corresponding to the depth map specifically includes:
for each pixel point in the depth map, determining a first projection point of the pixel point on a left view and a second projection point on a right view in a second binocular view;
determining a first image block corresponding to the first projection point and a second image block corresponding to the second projection point so as to obtain an image block difference value corresponding to the pixel point;
and determining positioning information corresponding to the depth map based on the image block difference value corresponding to each pixel point in the depth map.
Specifically, the second binocular view is the binocular view adjacent in time to the first binocular image corresponding to the depth map, with an acquisition moment before that of the first binocular image. Each pixel point of the depth map corresponds to a 3D space point, denoted P(X, Y, Z). As shown in Fig. 5, the space point P(X, Y, Z) corresponds to one projection point on the left view and one on the right view of the second binocular view; the projection point on the left view is denoted as the first projection point Q0(x0, y0) and the projection point on the right view as the second projection point Q1(x1, y1). The first projection point Q0 and the second projection point Q1 can be determined from the spatial point P and from the correspondence between the position state information of the binocular camera assembly at the moment corresponding to the depth map and the position state information of the binocular camera assembly at the moment corresponding to the second binocular image.
Further, suppose the correspondence between the position state information of the binocular camera assembly at the moment corresponding to the depth map and the position state information (including position information and attitude information) of the binocular camera assembly at the moment corresponding to the second binocular image is expressed as a relative rigid transform with rotation Ri and translation Ti (the transform is given as an equation image in the original publication), where Ri is the relative rotation matrix and Ti is the translation vector. The first projection point and the second projection point corresponding to the spatial point P are then obtained by transforming P with the left-view pose (R0, T0) and the right-view pose (R1, T1), respectively, and projecting the transformed points into the left view and the right view using the camera intrinsic parameters (the projection equations are likewise given as equation images in the original publication).
after the binocular camera shooting assembly is calibrated and corrected, the position state information of the left view and the right view meets the following relation:
R1=R0
Figure BDA0002568487370000101
wherein the content of the first and second substances,
Figure BDA0002568487370000102
and calibrating and determining the distance between a first image collector and a second image collector in the binocular camera shooting assembly according to the binocular camera shooting assembly.
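Since the projection equations appear only as images in the original text, the following sketch shows one standard way the two projection points could be computed for a rectified pair under a pinhole model; the intrinsic parameters f, cx, cy and the sign convention of the baseline offset are assumptions of this illustration, not the disclosed formula.

```python
import numpy as np

def project_to_stereo_pair(P, R0, T0, f, cx, cy, B):
    """Project the 3-D point P (taken from the depth map) into the left and right
    views of the earlier, rectified second binocular view, given the relative
    pose (R0, T0) of that view's left camera. Assumes the right camera is offset
    by the baseline B along the x axis, so R1 = R0 and T1 = T0 - [B, 0, 0]^T."""
    P = np.asarray(P, dtype=float)
    Pl = R0 @ P + T0                       # point in the left-camera frame
    Pr = Pl - np.array([B, 0.0, 0.0])      # same point in the right-camera frame
    q0 = (f * Pl[0] / Pl[2] + cx, f * Pl[1] / Pl[2] + cy)   # first projection point
    q1 = (f * Pr[0] / Pr[2] + cx, f * Pr[1] / Pr[2] + cy)   # second projection point
    return q0, q1
```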
Based on this, if the relative pose is correct, the first image block around the projection point of the spatial point P on the left view and the second image block around its projection point on the right view should be the same. Therefore, a first image block corresponding to the first projection point can be selected on the left view and a second image block corresponding to the second projection point on the right view, and the image block difference value corresponding to the spatial point P is determined from the first image block and the second image block. After the image block difference value corresponding to each pixel point in the depth map has been obtained, the positioning information corresponding to the depth map, comprising position information and attitude information, can be determined from all the image block difference values. The calculation of the positioning information can be cast as an optimization problem, which may be written as
(R0, T0) = argmin over (R0, T0) of Σ_{k=1}^{N} Dk(R0, T0)
where Dk(R0, T0) is the image block difference value of the k-th pixel point under the candidate pose, the initial value is the positioning information obtained from the previous solution, and N is the number of pixel points in the depth map.
Based on this, in one implementation of this embodiment, determining the positioning information corresponding to the depth map based on the image block difference value corresponding to each pixel point in the depth map specifically includes: determining a change parameter of the depth map relative to the second binocular view based on the image block difference value corresponding to each pixel point in the depth map; and determining the positioning information corresponding to the depth map based on the change parameter and the positioning information corresponding to the second binocular view.
Specifically, after the optimization problem has been set up, its optimal solution is determined, i.e. the pair (R0, T0) at which the objective takes its minimum value. This optimal solution is taken as the change parameter, and the positioning information corresponding to the depth map is then determined from the positioning information of the second binocular view and the change parameter. Because the positioning information is determined directly from the depth map and the second binocular image, no image feature points need to be extracted; this avoids the susceptibility of traditional visual positioning algorithms to scene texture and improves the accuracy of the positioning information.
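A compact sketch of the positioning objective follows: it evaluates the summed image-block difference for a candidate change parameter, reusing the hypothetical project_to_stereo_pair helper sketched above; the rotation-vector parameterization, the SAD block difference and the idea of handing the objective to a generic minimizer are choices of this illustration, not requirements of the patent.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def block_difference(img_l, img_r, q0, q1, half_win=3):
    """Sum of absolute differences between the block around q0 in the left view
    and the block around q1 in the right view (nearest-pixel sampling)."""
    x0, y0 = int(round(q0[0])), int(round(q0[1]))
    x1, y1 = int(round(q1[0])), int(round(q1[1]))
    b0 = img_l[y0 - half_win:y0 + half_win + 1, x0 - half_win:x0 + half_win + 1]
    b1 = img_r[y1 - half_win:y1 + half_win + 1, x1 - half_win:x1 + half_win + 1]
    if b0.shape != b1.shape or b0.size == 0:
        return 0.0                         # projection fell outside the views
    return float(np.abs(b0.astype(np.float32) - b1.astype(np.float32)).sum())

def positioning_objective(pose, points_3d, img_l, img_r, f, cx, cy, B):
    """Objective of the optimization problem: the summed image-block difference
    over the N depth-map points for a candidate change parameter.
    pose is a 6-vector: a rotation vector (3) followed by a translation (3)."""
    R0 = Rotation.from_rotvec(pose[:3]).as_matrix()
    T0 = np.asarray(pose[3:], dtype=float)
    total = 0.0
    for P in points_3d:
        q0, q1 = project_to_stereo_pair(P, R0, T0, f, cx, cy, B)
        total += block_difference(img_l, img_r, q0, q1)
    return total
```

The change parameter could then be obtained by handing this objective to a generic minimizer, for example scipy.optimize.minimize(positioning_objective, previous_pose, args=(points_3d, img_l, img_r, f, cx, cy, B)), initialized with the positioning information obtained from the previous solve and finally composed with the positioning information of the second binocular view.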
Further, in one implementation of this embodiment, as shown in Fig. 1, the vision sensor may further include an infrared fill-light assembly 30, which is connected to the microprocessor and configured to emit infrared light to provide supplementary illumination for the binocular camera assembly 10. The infrared fill-light assembly 30 may be arranged relative to the first image collector and the second image collector so that it illuminates both at the same time and the difference between the illumination it provides to the first image collector and to the second image collector is small; the image quality of the first image collected by the first image collector is then similar to that of the second image collected by the second image collector. Of course, in practical applications the infrared fill-light assembly may be arranged at other positions, as long as it can provide supplementary illumination for the binocular camera assembly.
In addition, the infrared fill-light assembly 30 is configured to provide supplementary illumination for the binocular camera assembly 10 when the laser projection assembly 20 is in the off state, so that the infrared light it generates does not interfere with the laser emitted by the laser projection assembly 20; this improves the accuracy of the depth information determined from the texture projected by the laser projection assembly 20. In one implementation of this embodiment, when the laser projection assembly is on, the infrared fill-light assembly 30 is off. Furthermore, the infrared fill-light assembly 30 provides supplementary illumination for the binocular camera assembly 10 when the ambient light is insufficient, for example when the ambient light intensity is lower than a preset brightness threshold.
Based on this, the microprocessor can also acquire the ambient light intensity and switch the infrared fill-light assembly on or off accordingly. The specific process may be as follows: the microprocessor acquires the ambient light intensity and checks whether it is sufficient; if the ambient light is sufficient, the infrared fill-light assembly is kept normally off; if it is not, the working state of the laser projection assembly is monitored: when the laser projection assembly is off, the infrared fill-light assembly is switched on to provide supplementary illumination for the binocular camera assembly, and when the laser projection assembly is on, the infrared fill-light assembly is switched off. In other words, when the ambient light is insufficient the microprocessor switches the infrared fill-light assembly on and off alternately, with its working state opposite to that of the laser projection assembly: when the laser projection assembly is off the fill-light assembly is on, and when the laser projection assembly is on the fill-light assembly is off.
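The fill-light policy just described can be summarized as a small decision function; the parameter names and the idea of returning a desired on/off state (rather than driving hardware directly) are assumptions of this sketch.

```python
def fill_light_should_be_on(ambient_intensity, brightness_threshold, laser_is_on):
    """Return True when the infrared fill-light assembly should be switched on:
    only when the ambient light is insufficient (below the preset brightness
    threshold) and the laser projection assembly is currently off."""
    if ambient_intensity >= brightness_threshold:
        return False              # enough ambient light: keep the fill light normally off
    return not laser_is_on        # dark scene: fill light runs opposite to the laser
```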
In summary, this embodiment provides an active binocular-based vision sensor comprising a binocular camera assembly, a laser projection assembly and a microprocessor, the microprocessor being communicatively connected to the binocular camera assembly and the laser projection assembly, respectively. The microprocessor switches the laser projection assembly on at intervals so that the binocular camera assembly alternately captures a first binocular view containing the projected texture and a second binocular view without it. With this interval-type texture projection, visual positioning and 3D depth measurement can be achieved at the same time, so the robot system only needs one vision sensor, which reduces the hardware cost of the robot.
Based on the above active binocular-based vision sensor, the invention further provides a robot equipped with the active binocular-based vision sensor described above.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. An active binocular-based vision sensor, the vision sensor comprising:
a binocular camera assembly for acquiring binocular views of a shooting scene;
a laser projection assembly for emitting a projected texture that can be sensed by the binocular camera assembly; and
a microprocessor communicatively connected to the binocular camera assembly and the laser projection assembly, respectively; the microprocessor is configured to switch the laser projection assembly on at intervals so that the binocular camera assembly alternately acquires a first binocular view and a second binocular view, wherein the first binocular view contains the projected texture and the second binocular view does not contain the projected texture.
2. The active binocular-based vision sensor of claim 1, wherein the microprocessor is further configured to determine a depth map based on the first binocular view, and to determine positioning information corresponding to the depth map based on the depth map and its corresponding second binocular view.
3. The active binocular based vision sensor of claim 2, wherein the second binocular view corresponding to the depth map is a second binocular view that is temporally located before the first binocular view corresponding to the depth map, and the second binocular view corresponding to the depth map is temporally adjacent to the first binocular view corresponding to the depth map.
4. The active binocular based vision sensor of claim 2 or 3, wherein the determining the positioning information corresponding to the depth map based on the second binocular view corresponding to the depth map and the depth map specifically comprises:
for each pixel point in the depth map, determining a first projection point of the pixel point on a left view and a second projection point on a right view in a second binocular view;
determining a first image block corresponding to the first projection point and a second image block corresponding to the second projection point so as to obtain an image block difference value corresponding to the pixel point;
and determining positioning information corresponding to the depth map based on the image block difference value corresponding to each pixel point in the depth map.
5. The active binocular-based vision sensor according to claim 4, wherein the determining of the positioning information corresponding to the depth map based on the image block difference value corresponding to each pixel point in the depth map specifically comprises:
determining a change parameter of the depth map relative to the second binocular view based on the image block difference value corresponding to each pixel point in the depth map;
and determining the positioning information corresponding to the depth map based on the variation parameter and the positioning information corresponding to the second binocular view.
6. The active binocular-based vision sensor of claim 1, wherein the vision sensor comprises:
an infrared fill-light assembly for emitting infrared light to provide supplementary illumination for the binocular camera assembly.
7. The active binocular-based vision sensor according to claim 6, wherein the infrared fill-light assembly is in an off state when the laser projection assembly is in an on state.
8. The active binocular-based vision sensor according to claim 1, wherein the binocular camera assembly comprises a first image collector and a second image collector arranged at an interval from each other, and the laser projection assembly is arranged between the first image collector and the second image collector.
9. A robot, characterized in that it is equipped with the active binocular-based vision sensor according to any one of claims 1-8.
CN202010630565.3A 2020-07-03 2020-07-03 Active binocular-based vision sensor and robot Active CN111753799B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010630565.3A CN111753799B (en) Active binocular-based vision sensor and robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010630565.3A CN111753799B (en) Active binocular-based vision sensor and robot

Publications (2)

Publication Number Publication Date
CN111753799A (en) 2020-10-09
CN111753799B (en) 2021-07-13

Family

ID=72679010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010630565.3A Active CN111753799B (en) Active binocular-based vision sensor and robot

Country Status (1)

Country Link
CN (1) CN111753799B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241983A (en) * 2020-10-19 2021-01-19 深圳市目心智能科技有限公司 Perception system and robot based on initiative binocular vision
WO2022188292A1 (en) * 2021-03-08 2022-09-15 北京石头创新科技有限公司 Target detection method and system, and control method, devices and storage medium
CN118161813A (en) * 2024-05-11 2024-06-11 徐州徐工道金特种机器人技术有限公司 Fire control and extinguishment control method, device and system, fire control equipment and computer product

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105844616A (en) * 2016-03-17 2016-08-10 湖南优象科技有限公司 Binocular stereo matching algorithm under laser scattering spot auxiliary and apparatus thereof
CN108012143A (en) * 2017-12-04 2018-05-08 深圳市沃特沃德股份有限公司 Binocular camera scaling method and device
US20180243082A1 (en) * 2017-02-10 2018-08-30 University Of Rochester Vision correction with laser refractive index changes
CN209991983U (en) * 2019-02-28 2020-01-24 深圳市道通智能航空技术有限公司 Obstacle detection equipment and unmanned aerial vehicle
CN111336422A (en) * 2020-02-28 2020-06-26 南通大学 LED shadowless lamp light supplementing device and light supplementing method thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105844616A (en) * 2016-03-17 2016-08-10 湖南优象科技有限公司 Binocular stereo matching algorithm under laser scattering spot auxiliary and apparatus thereof
US20180243082A1 (en) * 2017-02-10 2018-08-30 University Of Rochester Vision correction with laser refractive index changes
CN108012143A (en) * 2017-12-04 2018-05-08 深圳市沃特沃德股份有限公司 Binocular camera scaling method and device
CN209991983U (en) * 2019-02-28 2020-01-24 深圳市道通智能航空技术有限公司 Obstacle detection equipment and unmanned aerial vehicle
CN111336422A (en) * 2020-02-28 2020-06-26 南通大学 LED shadowless lamp light supplementing device and light supplementing method thereof

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241983A (en) * 2020-10-19 2021-01-19 深圳市目心智能科技有限公司 Perception system and robot based on initiative binocular vision
WO2022188292A1 (en) * 2021-03-08 2022-09-15 北京石头创新科技有限公司 Target detection method and system, and control method, devices and storage medium
CN118161813A (en) * 2024-05-11 2024-06-11 徐州徐工道金特种机器人技术有限公司 Fire control and extinguishment control method, device and system, fire control equipment and computer product

Also Published As

Publication number Publication date
CN111753799B (en) 2021-07-13

Similar Documents

Publication Publication Date Title
CN111753799B (en) Active binocular-based vision sensor and robot
US10088296B2 (en) Method for optically measuring three-dimensional coordinates and calibration of a three-dimensional measuring device
US10401143B2 (en) Method for optically measuring three-dimensional coordinates and controlling a three-dimensional measuring device
CN110728715B (en) Intelligent inspection robot camera angle self-adaptive adjustment method
CN1238691C (en) Combined stereovision, color 3D digitizing and motion capture system
CN108234984A (en) Binocular depth camera system and depth image generation method
TWI795425B (en) Apparatus and method for generating a representation of a scene
US20180056801A1 (en) Robotic Charger Alignment
CN105574525B (en) A kind of complex scene multi-modal biological characteristic image acquiring method and its device
US20060029256A1 (en) Method of generating image and device
CN109741405A (en) A kind of depth information acquisition system based on dual structure light RGB-D camera
CN106772431A (en) A kind of Depth Information Acquistion devices and methods therefor of combination TOF technologies and binocular vision
CN104604221A (en) Apparatus for generating depth image
CN102253057B (en) Endoscope system and measurement method using endoscope system
CN109285189B (en) Method for quickly calculating straight-line track without binocular synchronization
CN104935790A (en) Imaging system
CN109668509A (en) Based on biprism single camera three-dimensional measurement industrial endoscope system and measurement method
CN111009030A (en) Multi-view high-resolution texture image and binocular three-dimensional point cloud mapping method
WO2016040229A1 (en) Method for optically measuring three-dimensional coordinates and calibration of a three-dimensional measuring device
CN108592886A (en) Image capture device and image-pickup method
CN114359406A (en) Calibration of auto-focusing binocular camera, 3D vision and depth point cloud calculation method
CN110849269A (en) System and method for measuring geometric dimension of field corn cobs
WO2016040271A1 (en) Method for optically measuring three-dimensional coordinates and controlling a three-dimensional measuring device
CN108109130A (en) Self-adapting window stereo vision matching method based on laser facula
CN107044830B (en) Distributed multi-view stereoscopic vision system and target extraction method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant