CN116977600B - XR equipment and XR equipment height acquisition method - Google Patents

XR equipment and XR equipment height acquisition method

Info

Publication number
CN116977600B
CN116977600B CN202310803039.6A
Authority
CN
China
Prior art keywords
height
binocular camera
imax
image
binocular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310803039.6A
Other languages
Chinese (zh)
Other versions
CN116977600A (en)
Inventor
周子越
王俊
张腾
张逸伦
李卓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Play Out Dreams Shanghai Technology Co ltd
Original Assignee
Play Out Dreams Shanghai Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Play Out Dreams Shanghai Technology Co ltd filed Critical Play Out Dreams Shanghai Technology Co ltd
Priority to CN202310803039.6A
Publication of CN116977600A
Application granted
Publication of CN116977600B
Legal status: Active

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/97 Determining parameters from multiple pictures

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an XR device and an XR device height acquisition method. The height acquisition method comprises a coordinate system construction step, a three-dimensional point cloud acquisition step, a height partitioning step, a point counting step, a peak interval searching step and a height calculating step. This approach allows the XR device to infer its own height automatically and in real time.

Description

XR equipment and XR equipment height acquisition method
Technical Field
The invention relates to the field of computer vision, in particular to an XR device and an XR device height acquisition method.
Background
The XR device is a generic name for intelligent head-mounted devices such as VR devices, AR devices and MR devices, and is an electronic product capable of providing an immersive virtual-world experience. In this field, the 6DoF tracking that such devices mainly rely on is based on visual SLAM (Simultaneous Localization and Mapping): a robot starts moving from an unknown position in an unknown environment, localizes itself from its pose and the map built during motion, and meanwhile incrementally builds that map on the basis of self-localization, thereby achieving autonomous localization and navigation.
In addition to satisfying the 6DoF positioning of the XR device itself, it is often also necessary to calculate the ground height, that is, the distance between the XR device and the ground; the virtual scene can then render a ground whose height is consistent with the real world, which provides a more realistic virtual experience and enhances the user's immersion.
Many XR devices currently do not support real-time acquisition or calculation of the XR device height; instead they substitute a default value for the XR device height or infer it from the distance between the handle and the XR device. Default values tend to be inaccurate because users differ in height and use the device in a variety of postures. Calculating the XR device height with a handle tracking algorithm, by placing the handle on the ground, is somewhat cumbersome: the user must perform the additional operation of bending down to place the handle, which degrades the user experience.
Disclosure of Invention
The invention aims to provide an XR device and an XR device height acquisition method, which are used for solving the technical problem that the XR device in the prior art cannot acquire and calculate the XR device height in real time.
In order to solve the above problems, the present invention provides an XR device and an XR device height acquisition method, wherein the XR device height acquisition method comprises the following steps: a coordinate system construction step, in which a world coordinate system is established whose Z-axis direction is collinear with the direction of gravity; a three-dimensional point cloud acquisition step, in which a real-time environment image is acquired through a binocular camera of the XR device and a three-dimensional point cloud corresponding to the environment space of the XR device is obtained from the real-time environment image, the three-dimensional point cloud being a point set expressing the spatial distribution and surface characteristics of the real-time environment image; a height partitioning step, in which, taking a point on the XR device as a starting point, the world coordinate system is divided along the Z-axis direction into a plurality of equal-height intervals R_n, n being a natural number; a point counting step, in which the number of points X_i distributed within the i-th interval R_i is counted, i being an interval number not greater than n, yielding a sequence X_n; the entries of X_n smaller than a preset threshold are set to zero; a peak interval searching step, in which, scanning in the direction of decreasing n, the first local maximum X_imax in the sequence X_n is found, together with its corresponding peak interval R_imax; and a height calculating step, in which the height of the XR device, namely the distance between the peak interval R_imax and the starting point, is calculated.
Further, before the coordinate system constructing step, the XR device height acquisition method further comprises an initializing step, wherein parameters of a binocular camera of the XR device are acquired, and a mapping matrix for binocular stereo correction is calculated according to the parameters.
Further, the three-dimensional point cloud acquisition step specifically includes the following steps: a data acquisition step of acquiring a binocular camera image of a current frame, including a first image and a second image; a binocular stereo correction step, namely performing binocular stereo correction processing on the binocular camera image by using a mapping matrix to obtain a binocular camera corrected image; binocular stereo matching, namely performing binocular stereo matching on the corrected images of the binocular cameras to obtain parallax images of the first image and the second image; a three-dimensional point cloud computing step, namely computing a coordinate set of a three-dimensional point cloud under a binocular camera coordinate system corresponding to the parallax map according to the parallax map; and a coordinate conversion step of converting a coordinate set of the three-dimensional point cloud under the binocular camera coordinate system into the world coordinate system.
Further, in the binocular stereo correction step,
dst(x,y)=src(map_x(x,y),map_y(x,y))
where src is a binocular camera image, dst (x, y) is a corrected image of the binocular camera after correction processing, and map_x, map_y are mapping matrices.
Further, the binocular stereo matching step specifically includes the following steps: a semi-dense optical flow calculation step of calculating a semi-dense optical flow according to a binocular camera image to obtain motion vectors of a plurality of pixel points in the binocular camera image; a motion vector screening step of screening motion vectors in the horizontal direction from the motion vectors of the plurality of pixel points; and a parallax map construction step, namely constructing a parallax map of the binocular camera image according to the parallax value of each pixel point.
Further, in the three-dimensional point cloud computing step, coordinates (X, Y, Z) of any pixel point in the three-dimensional point cloud space are:
the base line length of the binocular camera is b, the focal length is f, the principal point is (cx, cy), the pixel point coordinates are (u, v), and the disparity value is disparity (u, v).
Further, in the height calculating step, the distance between the peak interval R_imax and the starting point is equal to the difference between the Z value corresponding to the bisecting plane of the peak interval R_imax in the Z-axis direction and the Z value of the starting point.
Further, the height calculating step specifically includes the following steps: a secondary height partitioning step, in which, taking the interval R_(imax-1) as a starting point, the peak interval R_imax and its adjacent intervals R_(imax-1) and R_(imax+1) are divided along the Z-axis direction into a plurality of equal-height second intervals T_m, m being a natural number; a secondary point counting step, in which the number of pixel points Y_i distributed within the i-th second interval T_i is counted, i being a second interval number not greater than m, yielding a second sequence Y_m; the entries of Y_m smaller than a preset threshold are set to zero; a secondary peak interval searching step, in which, scanning in the direction of decreasing m, the first local maximum Y_mmax in the sequence Y_m is found, together with its corresponding second peak interval T_mmax; and a secondary height calculating step, in which the difference between the Z value corresponding to the bisecting plane of the second peak interval T_mmax in the Z-axis direction and the Z value of the starting point is calculated, this difference being equal to the distance between the peak interval T_mmax and the starting point.
The invention also provides an XR device comprising a memory and a processor. The memory is used for storing executable program codes; the processor is configured to read the executable program code to execute a computer program corresponding to the executable program code to perform at least one step of the XR device height collection method.
Further, the XR device also includes a housing, a binocular camera, and an IMU sensor. The housing contains a circuit board on which the memory and the processor are mounted; the binocular camera is mounted to the housing and electrically connected to the processor; the IMU sensor is mounted to the housing and electrically connected to the processor.
The invention has the advantage of providing an XR device and an XR device height acquisition method: a world coordinate system is established, real-time environment images including images of the ground are acquired in real time by the binocular camera of the XR device, and a three-dimensional point cloud composed of a plurality of points corresponding to the environment space of the XR device is obtained from the real-time environment images. The environment space of the XR device is divided along the gravity direction into a plurality of equal-height intervals, the number of points in each interval is counted, and the interval containing the first local maximum of the point count, that is, the interval where the ground lies, is found; the distance between that interval and the starting point is then calculated to obtain the height of the XR device, so the XR device can estimate its own height automatically and in real time. The XR device height acquisition method provided by the invention can also divide the environment space of the XR device several times along the gravity direction; by reducing the number of executions of the point counting step, the amount of computation is reduced while the calculation accuracy is improved.
Drawings
FIG. 1 is a schematic diagram of an XR device according to embodiments 1 and 2 of the present invention;
FIG. 2 is a flow chart of the XR device height acquisition method of embodiments 1 and 2 of the present invention;
FIG. 3 is a flowchart showing the three-dimensional point cloud acquisition steps in embodiments 1 and 2 of the present invention;
FIG. 4 shows corrected binocular camera images in embodiments 1 and 2 of the present invention;
FIG. 5 is a flow chart of the binocular stereo matching step in embodiments 1 and 2 of the present invention;
FIG. 6 is a three-dimensional point cloud of binocular camera images of embodiments 1 and 2 of the present invention;
FIG. 7 is a flow chart of the height estimation step in embodiments 1 and 2 of the present invention;
FIG. 8 is a flowchart showing the steps of height calculation in embodiment 2 of the present invention;
fig. 9 is a schematic diagram showing the result of the height calculation step in embodiment 2 of the present invention.
The labels in the figures are as follows:
1: memory; 2: processor; 3: camera; 4: IMU sensor.
Detailed Description
The following description of the preferred embodiments of the present invention, with reference to the accompanying drawings, is provided to illustrate that the invention may be practiced, and these embodiments will fully describe the technical contents of the present invention to those skilled in the art so that the technical contents of the present invention may be more clearly and conveniently understood. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.
Example 1
As shown in fig. 1, the present embodiment provides an XR device comprising a housing with a circuit board mounted therein, a memory 1 and a processor 2. The memory 1 is used to store executable program code. The processor 2 is configured to read the executable program code and run a computer program corresponding to the executable program code, so as to perform at least one step of the XR device height acquisition method.
The XR device further comprises a plurality of cameras 3, the cameras 3 being mounted to the housing surface and electrically connected to the processor 2; the lenses of at least two cameras 3 are directed straight ahead, straight down and/or obliquely downward. Two cameras 3 arranged side by side in the lateral direction are called the left-eye camera and the right-eye camera, and together are also called a binocular camera. When the XR device is worn on the head of a user, the left-eye camera and the right-eye camera are arranged opposite the positions of the user's two eyes, and their lenses face straight ahead, straight down and/or obliquely downward, so that all or part of the ground appears in the field of view.
The XR device further comprises an IMU sensor 4, a sensor for detecting and measuring acceleration and rotational movement. The IMU sensor 4 includes an accelerometer and a gyroscope, which are core components of the inertial system, and are the main factors affecting the performance of the inertial system.
Based on the aforementioned XR device, the present embodiment further provides an XR device height acquisition method, which is implemented in software by the XR device.
As shown in fig. 2, the XR device height collection method of the present embodiment includes the following steps S1 to S4.
Step S1: and initializing, namely acquiring parameters of a binocular camera of the XR equipment, and calculating a mapping matrix for binocular stereo correction according to the parameters. The parameters of the binocular camera include left-eye camera parameters, right-eye camera parameters, and associated parameters of the left-eye camera and the right-eye camera.
For a binocular camera image, the mapping matrix map_x records the abscissa of each point in the binocular camera image and the mapping matrix map_y records the ordinate of each point in the binocular camera image.
Step S2: in the coordinate system construction step, the world coordinate system used in this embodiment is a 6DoF coordinate system based on SLAM technology, and the process of constructing the world coordinate system is as follows.
First, an initial coordinate system is defined at the position of the IMU at the moment of the first image frame. Then a joint optimization is performed using the accelerometer and gyroscope data acquired by the IMU within a certain time window together with the image information acquired by the camera, yielding the gravity direction vector in the initial coordinate system; a world coordinate system is then newly defined whose Z-axis is aligned with the gravity direction.
Step S3: and a three-dimensional point cloud acquisition step, namely acquiring a real-time environment image through a binocular camera of the XR equipment, and acquiring a three-dimensional point cloud corresponding to the environment space of the XR equipment according to the real-time environment image, wherein the three-dimensional point cloud is a point set for expressing the spatial distribution and the surface spectroscopical property of the real-time environment image. The real-time environment image is an image of a space where the XR equipment is located, when the XR equipment is worn on the head of a user, the left-eye camera and the right-eye camera are respectively arranged opposite to the positions of two eyes of the user, and the images or videos shot by the two-eye cameras comprise images of various objects in the field of view of the XR equipment. The position of various objects within the space in which the XR device is located may be coordinated using a particular algorithm, each object being a combination of three-dimensional point clouds, each three-dimensional point cloud comprising a large number of points. In general, when an XR device performs an XR experience in a room, since points in a three-dimensional point cloud are uniformly distributed on the surface of an object, and meanwhile, the surface area of the ground facing the binocular camera in the room is larger, points in the three-dimensional point cloud representing the ground of the room are more, after the object in the room is coordinated, after the whole three-dimensional space is divided into a plurality of flat spaces with equal height according to the height, the points of the flat space where the ground is located are more, and the points of the flat space where the ground is located are more than the points of the adjacent flat spaces. Since the origin of the coordinate system is located on the XR device, the distance of the flat space from the XR device is representative of the height of the XR device relative to the ground.
As shown in fig. 3, the three-dimensional point cloud acquisition step specifically includes steps S31 to S34.
Step S31: and a data acquisition step of acquiring a binocular camera image of the current frame, including a left-eye camera image and a right-eye camera image, by a binocular camera of the XR device. The acquisition interval of real-time acquisition is 0.1 to 1 second.
Step S32: and a binocular stereo correction step, namely performing binocular stereo correction processing on the binocular camera image by using the mapping matrix to obtain a corrected image of the binocular camera.
As shown in fig. 4, the purpose of the binocular stereo correction step is to make it easier to find corresponding points in two different images. In general, traversing all pixels of an image is computationally expensive. The present embodiment uses the epipolar constraint to reduce the search problem to one dimension: the point in one image corresponding to a point in the other image lies on that image's epipolar line. However, because the epipolar lines are usually tilted, matching with image blocks is inefficient, so the epipolar lines need to be made parallel to the baseline. The embodiment therefore performs binocular stereo correction to ensure that the left and right images are parallel.
In this embodiment, a remap () function is used, where the remap () function performs a remap geometric transformation on the original image according to a specified mapping form, based on the following formula:
dst(x,y)=src(map_x(x,y),map_y(x,y))
where src is a binocular camera image, dst (x, y) is a corrected image of the binocular camera after correction processing, and map_x, map_y are mapping matrices.
The remap() function prototype is as follows:
void remap(InputArray src, OutputArray dst, InputArray map1, InputArray map2, int interpolation, int borderMode = BORDER_CONSTANT, const Scalar& borderValue = Scalar())
map1 is the first map, of either (x, y) point pairs or x values only, of type CV_16SC2, CV_32FC1 or CV_32FC2.
If map1 holds (x, y) points, map2 is left empty; otherwise map2 holds the y values, of type CV_16UC1 or CV_32FC1.
interpolation is the interpolation method; the INTER_AREA method is not supported by remap(), so the options are: INTER_NEAREST: nearest-neighbour interpolation; INTER_LINEAR: bilinear interpolation; INTER_CUBIC: bicubic interpolation; INTER_LANCZOS4: Lanczos interpolation over an 8x8 neighbourhood;
borderMode is the pixel extrapolation method, with default value BORDER_CONSTANT; when it is BORDER_TRANSPARENT, pixels in the destination image corresponding to "outliers" in the source image are not modified by this function;
borderValue is the value used with a constant border, 0 by default.
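As a minimal C++ sketch (not taken from the patent), the rectification maps described above can be computed once during initialization with OpenCV's stereoRectify and initUndistortRectifyMap and then applied to every frame with remap; all intrinsic and extrinsic values below are placeholders rather than the patent's calibration:

#include <opencv2/opencv.hpp>

// Compute rectification maps once (initialization step S1), then rectify each
// incoming stereo pair with remap (binocular stereo correction step S32).
// Every numeric parameter here is a placeholder, not a value from the patent.
int main() {
    cv::Mat K_left  = (cv::Mat_<double>(3, 3) << 450, 0, 320, 0, 450, 240, 0, 0, 1);
    cv::Mat K_right = K_left.clone();
    cv::Mat D_left  = cv::Mat::zeros(1, 5, CV_64F);        // distortion coefficients
    cv::Mat D_right = cv::Mat::zeros(1, 5, CV_64F);
    cv::Mat R = cv::Mat::eye(3, 3, CV_64F);                 // rotation of right camera w.r.t. left
    cv::Mat T = (cv::Mat_<double>(3, 1) << -0.064, 0, 0);   // ~64 mm baseline (placeholder)
    cv::Size size(640, 480);

    cv::Mat R1, R2, P1, P2, Q;
    cv::stereoRectify(K_left, D_left, K_right, D_right, size, R, T, R1, R2, P1, P2, Q);

    // These map pairs play the role of map_x / map_y in the formula above.
    cv::Mat mapx_l, mapy_l, mapx_r, mapy_r;
    cv::initUndistortRectifyMap(K_left,  D_left,  R1, P1, size, CV_32FC1, mapx_l, mapy_l);
    cv::initUndistortRectifyMap(K_right, D_right, R2, P2, size, CV_32FC1, mapx_r, mapy_r);

    cv::Mat left = cv::imread("left.png"), right = cv::imread("right.png");  // placeholder paths
    cv::Mat left_rect, right_rect;
    cv::remap(left,  left_rect,  mapx_l, mapy_l, cv::INTER_LINEAR);
    cv::remap(right, right_rect, mapx_r, mapy_r, cv::INTER_LINEAR);
    return 0;
}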
Step S33: and binocular stereo matching, namely performing binocular stereo matching on the corrected images of the binocular cameras to obtain parallax images of the left-eye camera images and the right-eye camera images.
Stereo matching generally refers to dense matching of pixels, meaning that each pixel obtains a disparity value, and these disparity values need to be stored in a two-dimensional map. On one hand, the disparity value at a given position can be found quickly in this two-dimensional map through the pixel coordinates, and the disparity values are ordered in the same way as the image; on the other hand, the quality of the disparity map can be judged intuitively at a preliminary level by comparing it with the original image.
As shown in fig. 5, the binocular stereo matching step specifically includes steps S331 to S333.
Step S331: and calculating the semi-dense optical flow according to the binocular camera image to obtain motion vectors of a plurality of pixel points in the binocular camera image.
Step S332: a motion vector screening step of screening motion vectors in the horizontal direction from the motion vectors of the plurality of pixel points; the motion vector in the horizontal direction is an effective motion vector, the parallax value of the pixel point corresponding to the effective motion vector is the value of the effective motion vector, the motion vector in the non-horizontal direction is an ineffective motion vector, and the pixel point corresponding to the ineffective motion vector is set as an ineffective value.
Step S333: and a disparity map construction step, namely constructing a disparity map of the binocular camera image according to the disparity value of each pixel point.
The disparity d is equal to the column coordinate of a corresponding (homonymous) point in the left view minus its column coordinate in the right view, and is expressed in pixels.
The disparity map is a two-dimensional image and has the same size as the original image. The computation is performed using semi-dense optical flow, with each 8 x 8 pixel block having a disparity at its center position.
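A minimal C++ sketch of steps S331 to S333 follows, under the assumption that the semi-dense optical flow is computed with pyramidal Lucas-Kanade on one point per 8 x 8 block of the rectified grayscale left image; the patent does not name a specific optical-flow routine, and the 1-pixel vertical tolerance used for screening is likewise an assumption:

#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

// Track one point per 8x8 block from the rectified left image to the rectified
// right image (semi-dense optical flow), keep near-horizontal vectors as
// disparities, and write them into a disparity map. Inputs are assumed to be
// 8-bit grayscale rectified images.
cv::Mat semiDenseDisparity(const cv::Mat& left_rect, const cv::Mat& right_rect) {
    std::vector<cv::Point2f> pts, tracked;
    for (int y = 4; y < left_rect.rows; y += 8)
        for (int x = 4; x < left_rect.cols; x += 8)
            pts.emplace_back((float)x, (float)y);           // center of each 8x8 block

    std::vector<uchar> status;
    std::vector<float> err;
    cv::calcOpticalFlowPyrLK(left_rect, right_rect, pts, tracked, status, err);

    cv::Mat disparity(left_rect.size(), CV_32F, cv::Scalar(-1.f));  // -1 marks invalid pixels
    for (size_t i = 0; i < pts.size(); ++i) {
        if (!status[i]) continue;
        float dx = pts[i].x - tracked[i].x;                  // left column minus right column
        float dy = std::abs(pts[i].y - tracked[i].y);
        if (dy < 1.f && dx > 0.f)                            // keep only horizontal motion vectors
            disparity.at<float>((int)pts[i].y, (int)pts[i].x) = dx;
    }
    return disparity;
}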
Step S34: and a three-dimensional point cloud calculating step, namely calculating a coordinate set of the three-dimensional point cloud under a binocular camera coordinate system corresponding to the parallax map according to the parallax map.
The binocular camera coordinate system is a three-dimensional coordinate system established by taking a left-eye camera or a right-eye camera as an origin, and changes in real time along with the change of the pose of the binocular camera.
Coordinates (X, Y, Z) of any pixel point in the three-dimensional point cloud space are:
the base line length of the binocular camera is b, the focal length is f, the principal point is (cx, cy), the pixel point coordinates are (u, v), and the disparity value is disparity (u, v).
The three-dimensional point cloud is a point set expressing the spatial distribution and surface characteristics of the real-time environment image; it is a model formed by a series of points on the surfaces of the objects in the real-time environment image.
As shown in fig. 6, since the three-dimensional point cloud only contains a series of points on object surfaces, it is sparse. In practice this sparsity shows up in two ways: only data on the faces that face the camera is collected, and the farther a surface is from the camera, the sparser its point cloud data. Overall, the number of points on and near the ground surface is large, and the flat space containing the ground surface holds more points than the adjacent flat spaces.
Step S35: and a coordinate conversion step of converting a coordinate set of the three-dimensional point cloud under the binocular camera coordinate system into the world coordinate system.
Each image frame has a corresponding transformation matrix T that converts three-dimensional points in the camera coordinate system into the world coordinate system.
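As a hedged note on this conversion (the patent does not spell out the composition), if the pose T of the current frame is written as a rotation R and a translation t from the binocular camera coordinate system to the world coordinate system, each point cloud coordinate transforms as:

P_world = R * P_camera + t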
Step S4: and a height estimation step, namely taking a point on the XR equipment as a starting point, dividing the world coordinate system into a plurality of sections with equal height along the Z-axis direction, counting the number of points in the sections, finding out the section where the first local maximum value is located from bottom to top, and enabling the local maximum value to be larger than a preset threshold value, and calculating the distance between the section and the XR equipment to estimate the height of the XR equipment, so that the XR equipment can estimate the height of the XR equipment automatically in real time.
As shown in fig. 7, the height estimation step specifically includes steps S41 to S44.
Step S41: a height dividing step of dividing the world coordinate system into a plurality of sections R of equal height along the Z-axis direction with a point on the XR equipment as a starting point n N is a natural number.
The larger the value of n, the more R is each interval n The smaller the height of (c), the higher the calculation accuracy, but the greater the calculation amount to be paid out at the same time.
Step S42: counting the number of points, namely counting the ith interval R i Number of points of internal distribution X i I is a section number of n or less, and a series X is obtained n
The counting step is to count the number of points of the point cloud distributed in each interval.
Step S43: a peak value interval searching step, in which the number of the sequences X is in the direction from the large to the small of n n Find the first local maximum X imax Find and X imax Corresponding peak interval R imax
For local maxima X imax It is required to be greater than a preset threshold to exclude the effect of noise.
In the direction of n from large to small, there may be multiple peaks, in which case it is necessary to find the globally optimal local maximum X imax Instead of a locally optimal maximum, the plane in which the ground lies must lie at a local maximum X imax Corresponding peak interval R imax Nearby.
The meaning of taking the first local maximum value from the large to the small direction of n is that if a plurality of tables exist in the environment image, the number of points contained near the plane where the ground is located is not the largest, at this time, taking the local maximum value of the first number of points from the large to the small direction of n, namely from the low to the high direction under the world coordinate system, and the local maximum value needs to be larger than a given threshold value, the section where the local maximum value of the number of points is located is the section where the plane where the ground is located.
Step S44: a height calculating step of calculating the height of the XR device, which is the peak interval R imax Distance from the starting point. To increase accuracy, the peak interval R imax The distance from the starting point is equal to the peak interval R imax The difference between the Z value corresponding to the bisector in the Z axis direction and the Z value of the starting point.
For example, in the technical solution of this embodiment, an appropriate n value is first taken to make n equal to any value of 200-300, and the environment space is divided into 200-300 equal-height regions from top to bottom along the Z-axis directionEach interval is a flat space, the height of each interval is 0.6-1.6 cm, the number of points in each interval is counted, and the peak value interval is found to be R imax Since the plane on which the ground is located is necessarily located in the peak interval R imax Nearby, the XR device height is equal to the peak interval R imax The difference between the Z value corresponding to the bisecting plane in the Z axis direction and the Z value of the starting point is within 1.6 cm.
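A minimal C++ sketch of steps S41 to S44 follows; the 3 m search range, the 250 intervals and the count threshold of 30 are illustrative assumptions, not values taken from the patent:

#include <vector>

// Bin the Z coordinates of the world-frame point cloud into n equal-height
// intervals below the device (step S41), zero out bins under a noise threshold
// (step S42), scan from the lowest bin upward for the first local maximum
// (step S43), and return the distance from the device to that bin's bisecting
// plane (step S44).
double estimateDeviceHeight(const std::vector<double>& point_z,   // Z of each cloud point (world frame)
                            double start_z,                       // Z of the starting point on the XR device
                            double range = 3.0,                   // metres searched below the device (assumed)
                            int n = 250,                          // number of equal-height intervals (assumed)
                            int threshold = 30) {                 // minimum count accepted as a peak (assumed)
    const double bin_h = range / n;
    std::vector<int> count(n, 0);
    for (double z : point_z) {
        double depth = start_z - z;                        // distance below the starting point
        if (depth <= 0.0 || depth >= range) continue;
        count[(int)(depth / bin_h)]++;                     // interval index grows with depth
    }
    for (int& c : count)                                   // suppress bins below the threshold
        if (c < threshold) c = 0;

    // Scan from the deepest interval upward (decreasing n in the patent's terms,
    // i.e. from low to high in the world coordinate system) for the first local
    // maximum; the ground plane lies near that interval.
    for (int i = n - 1; i >= 0; --i) {
        bool peak = count[i] > 0 &&
                    (i == n - 1 || count[i] >= count[i + 1]) &&
                    (i == 0     || count[i] >= count[i - 1]);
        if (peak)
            return (i + 0.5) * bin_h;                      // bisecting plane of interval R_i
    }
    return -1.0;                                           // no ground-like peak found
}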
The advantage of this embodiment is that it provides an XR device and an XR device height acquisition method: a world coordinate system is established, real-time environment images including images of the ground are acquired by the binocular camera of the XR device, and a three-dimensional point cloud composed of a plurality of points corresponding to the environment space of the XR device is obtained from the real-time environment images. The environment space of the XR device is divided along the gravity direction into a plurality of equal-height intervals, the number of points in each interval is counted, and the interval containing the first local maximum of the point count, that is, the interval where the ground lies, is found; the distance between that interval and the starting point is then calculated to obtain the height of the XR device, so the XR device can estimate its own height automatically and in real time.
Example 2
The XR device height collection method described in embodiment 1 has the disadvantages of large calculation amount and slow calculation speed; meanwhile, the calculation accuracy is insufficient, and the error range is about 0.6-1.6 cm. The technical scheme of embodiment 2 can reduce the calculated amount, improve the calculation speed, reduce the calculation error and improve the calculation precision.
The hardware configuration of embodiment 2 is substantially the same as that of embodiment 1. This embodiment provides an XR device comprising a housing, a circuit board disposed in the housing, and a memory 1 and a processor 2 mounted on the circuit board. The memory 1 is used to store executable program code. The processor 2 is configured to read the executable program code and run a computer program corresponding to the executable program code, so as to perform at least one step of the XR device height acquisition method. The XR device further comprises a plurality of cameras 3 mounted to the housing surface and electrically connected to the processor 2. Two cameras 3 arranged side by side in the lateral direction are called the left-eye camera and the right-eye camera, and together are also called a binocular camera. When the XR device is worn on the head of a user, the left-eye camera and the right-eye camera are arranged opposite the positions of the user's two eyes, and their lenses face straight ahead, straight down and/or obliquely downward, so that all or part of the ground appears in the field of view.
The XR device further comprises an IMU sensor 4, a sensor for detecting and measuring acceleration and rotational movement. The IMU sensor 4 includes an accelerometer and a gyroscope, which are core components of the inertial system, and are the main factors affecting the performance of the inertial system.
As shown in fig. 8, the XR device height collection method of this embodiment includes all steps S1 to S4 of the XR device height collection method of embodiment 1, and the difference between this embodiment and embodiment 1 is that the step S44 height calculation step specifically includes steps S441 to S444.
Step S441: a secondary height partitioning step, which is divided into intervals R imax-1 As a starting point, along the Z-axis direction, a peak interval R imax And adjacent thereto the interval R imax-1 And R is R imax+1 A second section T divided into m equal-height sections m M is a natural number;
it should be noted that the plane in which the ground is located does not necessarily lie in the peak region R imax In the section R, the plane where the ground is located imax-1 Or R is imax+1 In this case, it is also possible to do so. So to ensure the accuracy of the calculation, the peak interval R needs to be set imax And adjacent thereto the interval R imax-1 And R is R imax+1 A second section T divided into a plurality of equal heights m
Step S442: a second point counting step of counting the ith interval T i Number Y of pixel points distributed internally i I is a second interval number less than or equal to m, and a second array Y is obtained m
The secondary point counting step is used for counting the number of points distributed in each interval of the point cloud again.
Step S443: a secondary peak interval searching step, in the direction from m to m,in the array Y m Find the first local maximum Y mmax Find and match Y mmax Corresponding second peak interval T mmax
For local maxima Y mmax It is required to be greater than a preset threshold to exclude the effect of noise.
Finding the position of the peak value interval, the plane of the ground is necessarily located in the second peak value interval T mmax Nearby.
Step S444: a secondary height calculating step of calculating the second peak interval T mmax The difference between the Z value corresponding to the bisector in the Z-axis direction and the Z value of the starting point is equal to the peak section T mmax Distance from the starting point.
Further, steps S441 to S444 may be repeated: the latest peak interval and its adjacent intervals are subdivided again into a plurality of intervals, the number of points distributed in each interval is counted, and the interval corresponding to the local maximum of the point count is found; when the intervals can no longer be subdivided, the difference between the Z value corresponding to the bisecting plane of that interval and the Z value of the starting point is equal to the height of the XR device, and the calculated height has the highest precision.
It should be noted that, in theory, n could be made as large as possible in the first partitioning, directly dividing the environment space into n intervals of minimal height. The cost, however, is that the amount of computation becomes very large; and since the height estimation is performed in real time, the XR device repeatedly executes the height estimation step S4 at a fairly high frequency, so the scheme of embodiment 1 lacks practicality compared with the scheme of the present embodiment.
As shown in fig. 9, the present embodiment divides the space in multiple passes. A suitable value n = 20 is taken and the environment space is divided into 20 intervals; the number of points in each interval is counted and the peak interval is found to be R_4, so the plane where the ground lies is necessarily located near R_4. The intervals R_3, R_4 and R_5 are then merged and re-divided: with m = 6, R_3, R_4 and R_5 are re-divided into 6 intervals, the number of points in each interval is counted, and the peak interval is found to be T_4, so the ground plane is necessarily located near T_4. Then T_3, T_4 and T_5 are merged and re-divided into 15 intervals, the number of points in each interval is counted, and a peak interval is found; the difference between the Z value of that interval and the Z value of the starting point is equal to the height of the XR device, and the calculated height has very high precision.
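A minimal C++ sketch of the coarse-to-fine idea of this embodiment follows, reusing the binning logic of the previous sketch on a depth window that shrinks each pass to the peak interval and its two neighbours; the per-pass bin count, the threshold and the stopping bin height are assumptions, not the patent's exact parameters:

#include <algorithm>
#include <vector>

// Repeatedly bin a shrinking depth window below the device, locate the peak bin
// scanning from the deepest bin upward, and narrow the window to that bin plus
// its two neighbours, until the bins can no longer usefully be subdivided.
double refineDeviceHeight(const std::vector<double>& point_z, double start_z,
                          double range = 3.0, double min_bin_h = 0.002,
                          int bins = 20, int threshold = 30) {
    double lo = 0.0, hi = range;                           // depth window below the device
    while ((hi - lo) / bins > min_bin_h) {
        double bin_h = (hi - lo) / bins;
        std::vector<int> count(bins, 0);
        for (double z : point_z) {
            double depth = start_z - z;
            if (depth <= lo || depth >= hi) continue;
            count[(int)((depth - lo) / bin_h)]++;
        }
        int peak = -1;
        for (int i = bins - 1; i >= 0; --i)                // deepest bin first
            if (count[i] >= threshold &&
                (i == bins - 1 || count[i] >= count[i + 1]) &&
                (i == 0 || count[i] >= count[i - 1])) { peak = i; break; }
        if (peak < 0) return -1.0;                         // no ground-like peak found
        double new_lo = lo + (peak - 1) * bin_h;           // peak bin plus its neighbours
        double new_hi = lo + (peak + 2) * bin_h;
        lo = std::max(lo, new_lo);
        hi = std::min(hi, new_hi);
    }
    return 0.5 * (lo + hi);                                // bisecting plane of the final window
}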
The advantage of this approach is that, with the height of the final peak interval unchanged, dividing the intervals in several passes reduces the number of times the point counting step is executed, reducing the amount of computation while improving the calculation accuracy.
The advantage of this embodiment is that the XR device height collection method provided in this embodiment can also divide the environmental space of the XR device multiple times in the gravity direction, and by reducing the execution times of the point counting step, the calculation amount is reduced and the calculation accuracy is improved.
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.

Claims (10)

1. An XR device height collection method, comprising:
a coordinate system construction step, namely, a world coordinate system is established, and the Z-axis direction and the gravity direction of the world coordinate system are on the same straight line;
a three-dimensional point cloud acquisition step, namely acquiring an environment image in real time through a binocular camera of an XR device, and acquiring a three-dimensional point cloud corresponding to the environment image under a world coordinate system;
a height partitioning step, with the followingA point on the XR device is taken as a starting point, and the world coordinate system is divided into a plurality of equal-height intervals R along the Z-axis direction n N is a natural number;
counting the number of points, namely counting the ith interval R i Number of points of internal distribution X i I is a section number of n or less, and a series X is obtained n The method comprises the steps of carrying out a first treatment on the surface of the The number of the X is set n Modifying the number smaller than a preset threshold value to zero;
a peak value interval searching step, in which the number of the sequences X is in the direction from the large to the small of n n Find the first local maximum X imax Find and X imax Corresponding peak interval R imax The method comprises the steps of carrying out a first treatment on the surface of the And
a height calculating step of calculating the height of the XR device, which is the peak interval R imax Distance from the starting point.
2. The XR device height collection method of claim 1, wherein,
before the coordinate system construction step, further comprising:
and initializing, namely acquiring parameters of a binocular camera of the XR equipment, and calculating a mapping matrix for binocular stereo correction according to the parameters.
3. The XR device height collection method of claim 1, wherein,
the three-dimensional point cloud acquisition step specifically comprises the following steps:
a data acquisition step of acquiring a binocular camera image of a current frame, including a first image and a second image;
a binocular stereo correction step, namely performing binocular stereo correction processing on the binocular camera image by using a mapping matrix to obtain a binocular camera corrected image; binocular stereo matching, namely performing binocular stereo matching on the corrected images of the binocular cameras to obtain parallax images of the first image and the second image; a three-dimensional point cloud computing step, namely computing a coordinate set of a three-dimensional point cloud under a binocular camera coordinate system corresponding to the parallax map according to the parallax map; and a coordinate conversion step of converting a coordinate set of the three-dimensional point cloud in the binocular camera coordinate system into the world coordinate system.
4. The XR device height collection method of claim 3,
in the binocular stereo correction step,
dst(x,y)=src(map_x(x,y),map_y(x,y))
where src is a binocular camera image, dst (x, y) is a corrected image of the binocular camera after correction processing, and map_x, map_y are mapping matrices.
5. The XR device height collection method of claim 3, wherein the binocular stereo matching step comprises: a semi-dense optical flow calculation step of calculating a semi-dense optical flow according to a binocular camera image to obtain motion vectors of a plurality of pixel points in the binocular camera image;
a motion vector screening step of screening motion vectors in the horizontal direction from the motion vectors of the plurality of pixel points; and
and a parallax map construction step, namely constructing a parallax map of the binocular camera image according to the parallax value of each pixel point.
6. The XR device height collection method of claim 3, wherein in the three-dimensional point cloud computing step, coordinates (X, Y, Z) of any pixel point in the three-dimensional point cloud space are:
the base line length of the binocular camera is b, the focal length is f, the principal point is (cx, cy), the pixel point coordinates are (u, v), and the disparity value is disparity (u, v).
7. The XR device height collection method of claim 1, wherein,
in the height calculating step,
the distance between the peak interval R_imax and the starting point is equal to
the difference between the Z value corresponding to the bisecting plane of the peak interval R_imax in the Z-axis direction and the Z value of the starting point.
8. The XR device height collection method of claim 1, wherein,
the height calculating step specifically comprises the following steps:
a secondary height partitioning step, in which, taking the interval R_(imax-1) as a starting point, the peak interval R_imax and its adjacent intervals R_(imax-1) and R_(imax+1) are divided along the Z-axis direction into a plurality of equal-height second intervals T_m, m being a natural number;
a secondary point counting step, in which the number of pixel points Y_i distributed within the i-th second interval T_i is counted, i being a second interval number not greater than m, yielding a second sequence Y_m; the entries of Y_m smaller than a preset threshold are set to zero;
a secondary peak interval searching step, in which, scanning in the direction of decreasing m, the first local maximum Y_mmax in the sequence Y_m is found, together with its corresponding second peak interval T_mmax; and
a secondary height calculating step, in which the difference between the Z value corresponding to the bisecting plane of the second peak interval T_mmax in the Z-axis direction and the Z value of the starting point is calculated, this difference being equal to the distance between the peak interval T_mmax and the starting point.
9. An XR device comprising
A memory to store executable program code; and
a processor to read the executable program code to run a computer program corresponding to the executable program code to perform at least one step of the XR device height collection method of any one of claims 1-8.
10. The XR device of claim 9, further comprising:
a housing including a circuit board for mounting the memory and the processor;
a binocular camera mounted to the housing and electrically connected to the processor; and
an IMU sensor mounted to the housing and electrically connected to the processor.
CN202310803039.6A 2023-07-03 2023-07-03 XR equipment and XR equipment height acquisition method Active CN116977600B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310803039.6A CN116977600B (en) 2023-07-03 2023-07-03 XR equipment and XR equipment height acquisition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310803039.6A CN116977600B (en) 2023-07-03 2023-07-03 XR equipment and XR equipment height acquisition method

Publications (2)

Publication Number Publication Date
CN116977600A CN116977600A (en) 2023-10-31
CN116977600B (en) 2024-04-09

Family

ID=88482451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310803039.6A Active CN116977600B (en) 2023-07-03 2023-07-03 XR equipment and XR equipment height acquisition method

Country Status (1)

Country Link
CN (1) CN116977600B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10636158B1 (en) * 2020-01-13 2020-04-28 Bodygram, Inc. Methods and systems for height estimation from a 2D image using augmented reality
CN111427150A (en) * 2020-03-12 2020-07-17 华南理工大学 Eye movement signal processing method used under virtual reality head-mounted display and wearable device
CN113066004A (en) * 2021-03-19 2021-07-02 广东博智林机器人有限公司 Point cloud data processing method and device
CN114663486A (en) * 2021-12-30 2022-06-24 南京图菱视频科技有限公司 Building height measurement method and system based on binocular vision

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111445583B (en) * 2020-03-18 2023-08-01 Oppo广东移动通信有限公司 Augmented reality processing method and device, storage medium and electronic equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10636158B1 (en) * 2020-01-13 2020-04-28 Bodygram, Inc. Methods and systems for height estimation from a 2D image using augmented reality
CN111427150A (en) * 2020-03-12 2020-07-17 华南理工大学 Eye movement signal processing method used under virtual reality head-mounted display and wearable device
CN113066004A (en) * 2021-03-19 2021-07-02 广东博智林机器人有限公司 Point cloud data processing method and device
CN114663486A (en) * 2021-12-30 2022-06-24 南京图菱视频科技有限公司 Building height measurement method and system based on binocular vision

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Cloud base height measurement method based on binocular imaging; Tao Fa et al.; Journal of Applied Meteorological Science; 2013-06-15; Vol. 24, No. 3; pp. 323-331 *

Also Published As

Publication number Publication date
CN116977600A (en) 2023-10-31

Similar Documents

Publication Publication Date Title
CN110702111B (en) Simultaneous localization and map creation (SLAM) using dual event cameras
US10818029B2 (en) Multi-directional structured image array capture on a 2D graph
CN109166149B (en) Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU
US10846913B2 (en) System and method for infinite synthetic image generation from multi-directional structured image array
KR101776622B1 (en) Apparatus for recognizing location mobile robot using edge based refinement and method thereof
US10469828B2 (en) Three-dimensional dense structure from motion with stereo vision
US9830715B2 (en) Method for determining a parameter set designed for determining the pose of a camera and/or for determining a three-dimensional structure of the at least one real object
CN110111388B (en) Three-dimensional object pose parameter estimation method and visual equipment
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
KR20150144728A (en) Apparatus for recognizing location mobile robot using search based correlative matching and method thereof
CN205451195U (en) Real-time three-dimensional point cloud reconstruction system based on multiple cameras
CN112150518B (en) Attention mechanism-based image stereo matching method and binocular device
EP3665651A1 (en) Hierarchical disparity hypothesis generation with slanted support windows
CN107330930B (en) Three-dimensional image depth information extraction method
Ramirez et al. Open challenges in deep stereo: the booster dataset
Praveen Efficient depth estimation using sparse stereo-vision with other perception techniques
US8340399B2 (en) Method for determining a depth map from images, device for determining a depth map
CN114981845A (en) Image scanning method and device, equipment and storage medium
Yong-guo et al. The navigation of mobile robot based on stereo vision
CN116977600B (en) XR equipment and XR equipment height acquisition method
Kurz et al. Bundle adjustment for stereoscopic 3d
JP6843552B2 (en) Image processing equipment, image processing methods and programs.
WO2021049281A1 (en) Image processing device, head-mounted display, and spatial information acquisition method
CN116912403B (en) XR equipment and obstacle information sensing method thereof
Zhao et al. The obstacle avoidance and navigation based on stereo vision for mobile robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 201612 Room 501, building 3, No. 1, caosong Road, Xinqiao Town, Songjiang District, Shanghai

Applicant after: Play Out Dreams (Shanghai) Technology Co.,Ltd.

Address before: 201612 Room 501, building 3, No. 1, caosong Road, Xinqiao Town, Songjiang District, Shanghai

Applicant before: Shanghai yuweia Technology Co.,Ltd.

GR01 Patent grant