CN110675346A - Image acquisition and depth map enhancement method and device suitable for Kinect - Google Patents

Image acquisition and depth map enhancement method and device suitable for Kinect Download PDF

Info

Publication number
CN110675346A
Authority
CN
China
Prior art keywords
image
repaired
depth
depth image
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910917492.3A
Other languages
Chinese (zh)
Other versions
CN110675346B (en)
Inventor
吴怀宇
洪运志
***浩
刘家乐
李琳
陈思文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Science and Engineering WUSE
Original Assignee
Wuhan University of Science and Engineering WUSE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Science and Engineering WUSE filed Critical Wuhan University of Science and Engineering WUSE
Priority to CN201910917492.3A priority Critical patent/CN110675346B/en
Publication of CN110675346A publication Critical patent/CN110675346A/en
Priority to AU2020101832A priority patent/AU2020101832A4/en
Application granted granted Critical
Publication of CN110675346B publication Critical patent/CN110675346B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/73 Deblurring; Sharpening
    • G06T5/75 Unsharp masking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20032 Median filtering

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image acquisition and depth map enhancement method and device suitable for the Kinect, belonging to the field of image processing. The method comprises depth-camera-based data acquisition, improved FMM-based depth map hole repair, and depth map edge noise smoothing: preliminary raw data of the color and depth maps are acquired and then format-converted to obtain the raw color and depth data; for the obtained depth map data and the holes present in the depth map, a repair mask generation method based on inverse binary thresholding is adopted to repair large-area holes in the depth map; and for the depth image obtained after hole repair, further image enhancement is performed with an image edge noise smoothing method based on median filtering. The invention can noticeably remove the holes and noise in the depth map and further enhance the applicability and reliability of the Kinect in computer vision.

Description

Image acquisition and depth map enhancement method and device suitable for Kinect
Technical Field
The invention belongs to the field of image processing, relates to a data preprocessing method based on a depth image, and more particularly relates to an image acquisition and depth image enhancement method and device suitable for Kinect.
Background
Depth information plays an important role in many computer vision applications, such as augmented reality, scene reconstruction, and auxiliary sensing for 3D television. Advances in sensor technology have brought many depth cameras to the market. A representative low-cost, real-time depth sensor is the Kinect produced by Microsoft, a structured-light camera. Because structured-light sensing suffers from multiple reflections, abnormal reflections, occlusion of the projected light, and poorly reflective surfaces, the raw Kinect depth map exhibits a number of problems, and some regions lack depth data entirely. These errors are an important problem affecting the use of Kinect data and the development of applications based on it.
Therefore, before the images of a depth camera such as the Kinect are applied, the camera's data must be preprocessed: hole pixels must be filled and image noise smoothed, so as to improve the reliability of the depth camera in computer vision applications. How to effectively preprocess depth image data is thus a technical problem to be solved at present.
Disclosure of Invention
In view of the above defects or improvement requirements of the prior art, the present invention provides an image acquisition and depth map enhancement method and device suitable for the Kinect, thereby solving the technical problem of effectively preprocessing depth image data to fill hole pixels and smooth image noise.
To achieve the above object, according to an aspect of the present invention, there is provided an image acquisition and depth map enhancement method for a Kinect, including:
(1) obtaining an original depth image, and performing format conversion on the original depth image to obtain a depth image in a target format;
(2) determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is located at the edge position of the image, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
(3) combining the mask of the region to be repaired with a fast marching algorithm to fill the hole of the region to be repaired in the depth image of the target format to obtain a repaired depth image;
(4) performing median filtering on the depth image after hole repair to remove image edge noise and obtain the image-enhanced depth image.
Preferably, step (2) comprises:
if the region to be repaired in the depth image of the target format is not located at the edge of the image, adding a mouse callback function to the OpenCV interface function cvInpaint, and then setting a color threshold range according to the region to be repaired in the depth image of the target format to obtain the mask of the region to be repaired.
Preferably, step (2) comprises:
if the region to be repaired in the depth image of the target format is located at the edge of the image, by
dΩ(x, y) = 255 if I(x, y) ≤ threshold, and dΩ(x, y) = 0 otherwise, for (x, y) ∈ Ω
determining the inverse binary thresholding function of the mask of the region to be repaired, so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
Preferably, step (3) comprises:
by
D(p) = Σ_{q∈Bε(p)} w(p, q)[D(q) + ∇D(q)·(p - q)] / Σ_{q∈Bε(p)} w(p, q)
filling holes in the region to be repaired in the depth image of the target format, wherein point p is a pixel to be repaired, Dp represents the depth value at point p, Bε(p) denotes the neighborhood of point p, q is a pixel in Bε(p), w(p, q) is used to measure the similarity between point p and the neighborhood pixel q, Dq represents the depth value at point q, ∇Dq represents the luminance gradient value at point q, and (p - q) represents the geometric distance between pixel p and pixel q.
Preferably, step (4) comprises:
performing median filtering on the depth image after hole repair by g(x, y) = med{f(x - k, y - l), (k, l) ∈ w} to obtain the image-enhanced depth image, wherein g(x, y) represents the median-filtered image, f(x, y) represents the depth image after hole repair, w represents the two-dimensional median filtering template, and k and l take values in w.
According to another aspect of the present invention, there is provided an image acquisition and depth map enhancement apparatus for a Kinect, including:
the image format conversion module is used for acquiring an original depth image and performing format conversion on the original depth image to obtain a depth image in a target format;
the mask image generation module is used for determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is located at the edge position of the image or not, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
the repairing module is used for combining the mask of the region to be repaired with a fast marching algorithm to fill the hole of the region to be repaired in the depth image of the target format to obtain a repaired depth image;
and the filtering module is used for performing median filtering on the depth image after hole repair so as to remove image edge noise and obtain the image-enhanced depth image.
Preferably, the mask image generation module is configured to add a mouse callback function to the OpenCV interface function cvInpaint when the region to be repaired in the depth image of the target format is not located at an edge of the image, and then set a color threshold range according to the region to be repaired in the depth image of the target format to obtain the mask of the region to be repaired.
Preferably, the mask image generation module is configured to, when the region to be repaired in the depth image of the target format is located at the edge of the image, by
dΩ(x, y) = 255 if I(x, y) ≤ threshold, and dΩ(x, y) = 0 otherwise, for (x, y) ∈ Ω
determine the inverse binary thresholding function of the mask of the region to be repaired, so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
Preferably, the repair module is configured to, by
D(p) = Σ_{q∈Bε(p)} w(p, q)[D(q) + ∇D(q)·(p - q)] / Σ_{q∈Bε(p)} w(p, q)
fill holes in the region to be repaired in the depth image of the target format, wherein point p is a pixel to be repaired, Dp represents the depth value at point p, Bε(p) denotes the neighborhood of point p, q is a pixel in Bε(p), w(p, q) is used to measure the similarity between point p and the neighborhood pixel q, Dq represents the depth value at point q, ∇Dq represents the luminance gradient value at point q, and (p - q) represents the geometric distance between pixel p and pixel q.
Preferably, the filtering module is configured to perform median filtering on the depth image after hole repair by g(x, y) = med{f(x - k, y - l), (k, l) ∈ w} to obtain the image-enhanced depth image, wherein g(x, y) represents the median-filtered image, f(x, y) represents the depth image after hole repair, w represents the two-dimensional median filtering template, and k and l take values in w.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects: the method can obtain the raw color and depth images of the Kinect depth camera, and the obtained data can be applied to three-dimensional scene reconstruction, robot V-SLAM, and so on. At the same time, for the holes and noise that are ubiquitous in depth maps due to the sensor and the environment, the improved hole repair method based on the FMM (fast marching method) algorithm and the noise smoothing method based on median filtering can noticeably remove holes and noise from the depth map, further enhancing the applicability and reliability of the Kinect in computer vision research.
Drawings
FIG. 1 is a schematic flow chart of a method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of noise in a Kinect captured image according to an embodiment of the present invention;
FIG. 3 is a flowchart of depth map restoration and denoising according to an embodiment of the present invention;
fig. 4 is a schematic diagram of depth map hole repair with the improved FMM algorithm provided in an embodiment of the present invention, where (a) shows the FMM algorithm repair principle and (b) shows a depth map repaired with the improved FMM algorithm;
FIG. 5 is a diagram illustrating median filtering noise smoothing according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention provides an image acquisition and depth map enhancement method and device suitable for the Kinect, and aims to design a data acquisition and preprocessing workflow suitable for depth camera V-SLAM: a Kinect_SDK-based method for acquiring raw color and depth images in a Windows environment; an improved FMM algorithm for repairing depth map holes; and a median-filtering-based noise removal method for smoothing depth map edge noise.
Fig. 1 is a schematic flow chart of an image acquisition and depth map enhancement method for a Kinect according to an embodiment of the present invention, where the method shown in fig. 1 includes the following steps:
s1: collecting and preprocessing a color depth image;
Since the color and depth maps collected on the ROS platform are image data generated by rqt or screen capture rather than the raw depth data (sixteen-bit, single channel), the embodiment of the present invention adopts an image acquisition method based on the Kinect_SDK in a Windows environment. Preliminary raw data of the color and depth maps are obtained by means of the Kinect development kit and driver, and the preliminary raw data are then processed with the image matrix processing tool Matlab to obtain the required raw color and depth data.
In the embodiment of the present invention, step S1 may be implemented by the following steps:
S11: data acquisition: start the PC and the Kinect depth camera, open the Kinect camera driver package Kinect_SDK, and test whether image information can be sent normally to the development kit; open KinectExplorer-D2D.exe under the KinectSaver folder;
S12: selecting the type of the saved files: on the basis of step S11, files in the officially specified formats of the color and depth maps need to be generated. After the Kinect_SDK is opened, several default image storage formats are available; in the embodiment of the invention, the color image needs to be stored as a three-channel 8-bit png picture and the depth image as a single-channel 16-bit png picture. After the KinectExplorer-D2D.exe running interface appears, select recording in the lower-right menu bar, select the image format for the color map, and select the binary format for the depth map.
S13: customizing folders and paths, and storing the color and depth maps separately: the Kinect v1 depth camera has an RGB color camera, an infrared CMOS camera, and an infrared emitter. Its color and depth information are delivered independently by the Kinect_SDK. After the preliminary raw data of the color and depth maps are obtained in two separate folders on the Windows platform, a series of data streams is available. In the embodiment of the present invention, the obtained color data stream is a series of bmp images at different times, and the obtained depth data stream is binary data at the corresponding times, which generally cannot display image information directly. Therefore, the preliminary raw data need to be processed to obtain raw color and depth images.
S14: performing format conversion and processing on the raw data with Matlab: after the bmp and binary data are obtained in step S13, the color images are converted into three-channel 8-bit png images; for the depth data, a Matlab script is written to split the binary data stream and convert its format, producing single-channel 16-bit png images.
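The patent performs this splitting and conversion with a Matlab script; purely as an illustration of the same step, a C++/OpenCV sketch is given below. The 640×480 frame size, the little-endian 16-bit packing, and the file names are assumptions for illustration, not details taken from the patent.
#include <opencv2/opencv.hpp>
#include <fstream>
#include <string>
#include <vector>
// Sketch: split a raw binary Kinect depth stream into single-channel 16-bit PNG frames.
int main() {
    const int width = 640, height = 480;            // assumed Kinect v1 resolution
    const size_t frameBytes = width * height * 2;   // 16 bits per pixel
    std::ifstream in("depth_stream.bin", std::ios::binary);  // hypothetical input file
    std::vector<unsigned char> buf(frameBytes);
    int idx = 0;
    while (in.read(reinterpret_cast<char*>(buf.data()), frameBytes)) {
        // Wrap the raw bytes as a 16-bit single-channel image (no copy) and save as PNG.
        cv::Mat depth(height, width, CV_16UC1, buf.data());
        cv::imwrite("depth_" + std::to_string(idx++) + ".png", depth);
    }
    return 0;
}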
S2: improving the depth map hole repair of the FMM algorithm;
As shown in fig. 2, the Kinect uses structured-light imaging to obtain depth information of the field of view, and the depth image often has missing values. When the surface of an object is smooth or dark, specular reflection or light absorption weakens the light reflected back into the camera, detection fails, and black holes appear in the corresponding depth image, i.e., pixel values are lost. To ensure the accuracy of feature localization in V-SLAM, this black-hole noise must first be filled in, that is, the depth image must be repaired, which is significant for the practical application of the depth map.
In the embodiment of the present invention, step S2 may be implemented as follows:
s21: representation of images in computer:
Under the conventional definition of the pixel coordinate system, the origin is at the upper-left corner of the image, the X axis points right, and the Y axis points down. In a gray-scale image each pixel position (x, y) corresponds to a gray value I, so an image of width w and height h can be written mathematically as a matrix, as shown in formula (1).
I(x, y) ∈ R^(w×h)    (1)
In the depth map of the depth camera, the distance of each pixel from the camera is also recorded, i.e., d in (u, v, d). This distance is usually given in millimeters; the Kinect range, for example, is usually around ten meters, which exceeds the maximum of the 0-255 range. Therefore, the raw depth image is generally stored in the computer as sixteen-bit integers, and in specific applications the first twelve bits of the sixteen-bit data need to be extracted to obtain the real depth information of the image.
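As a minimal sketch of how such a single-channel 16-bit png can be read and inspected with OpenCV (the file names and the roughly ten-metre scaling used only for visualization are assumptions):
#include <opencv2/opencv.hpp>
#include <iostream>
int main() {
    // Read the 16-bit depth PNG without converting it down to 8 bits.
    cv::Mat depth = cv::imread("depth_0.png", cv::IMREAD_UNCHANGED);  // hypothetical file
    CV_Assert(depth.type() == CV_16UC1);
    // Depth value (in millimetres, per the text) at the image centre (column 320, row 240).
    unsigned short d = depth.at<unsigned short>(240, 320);
    std::cout << "depth at centre: " << d << " mm" << std::endl;
    // Scale to 8 bits only for display; processing keeps the 16-bit data.
    cv::Mat vis;
    depth.convertTo(vis, CV_8U, 255.0 / 10000.0);  // assumes a working range of about 10 m
    cv::imwrite("depth_vis.png", vis);
    return 0;
}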
S22: FMM algorithm image restoration principle:
FIG. 4(a) shows the hole repair principle for the depth map. The Ω region is the region to be repaired, and δΩ denotes the boundary of Ω. To repair a pixel in Ω, a new pixel value needs to be calculated to replace the original value. Point p is the pixel that needs repair. A small neighborhood Bε(p) centered on p is selected; the pixel values of the points in this neighborhood are known, so the pixel value of point p can be estimated from the valid pixels in its neighborhood. Here ε is the parameter inpaintRadius in the OpenCV function, and q is a point in Bε(p). The formula for estimating the gray value of p from point q is shown in formula (2), where ∇D(q) is the gray gradient at point q;
D(p) = D(q) + ∇D(q)·(p - q)    (2)
All points in the neighborhood Bε(p) are then used to calculate the new gray value of point p, and a weighting function is introduced to decide which points have a more decisive effect on the new pixel's gray value, because the contribution of each point is generally different. The subsequent processing is shown in equation (3):
D(p) = Σ_{q∈Bε(p)} w(p, q)[D(q) + ∇D(q)·(p - q)] / Σ_{q∈Bε(p)} w(p, q)    (3)
where Dq represents the depth value at point q, ∇Dq represents the luminance gradient value at point q, (p - q) represents the geometric distance between pixel p and pixel q, and w(p, q) is a weight function used to measure the similarity between point p and the neighborhood pixel q, that is, to determine the influence factor of each pixel in the neighborhood on the gray value to be calculated. The weight function is given by formula (4):
w(p,q)=dir(p,q)·dst(p,q)·lev(p,q) (4)
wherein, the explanation of each amount in the formula (4) is as in the formula (5-7).
dir(p, q) = ((p - q) / ‖p - q‖) · N(p)    (5)
dst(p, q) = d0² / ‖p - q‖²    (6)
lev(p, q) = T0 / (1 + |T(p) - T(q)|)    (7)
where d0 and T0 are the distance parameter and the level set parameter, respectively, and are generally set to 1. The direction parameter dir(p, q) ensures that pixels closer to the normal direction N = ∇T have the greatest influence on point p; the geometric distance factor dst(p, q) ensures that pixels closer to point p contribute more to p; and the level set distance parameter lev(p, q) ensures that pixels closer to the contour of the region to be repaired passing through p have a larger influence on p. N(p) denotes the size of the neighborhood window of the repaired pixel, and T(p), T(q) denote the distance from points p and q to the neighborhood boundary δΩ.
Image restoration is initially realized through the OpenCV interface function cvInpaint; the prototype of the inpaint function is as follows:
void inpaint(InputArray src, InputArray inpaintMask,
OutputArray dst, double inpaintRadius, int flags);
Parameter src: the input single-channel or three-channel image. Parameter inpaintMask: the mask of the image, a single-channel image of the same size as the original, in which the pixel values of all parts other than the part to be repaired are 0. Parameter dst: the output repaired image. Parameter inpaintRadius: the neighborhood radius used by the repair algorithm to compute the difference for the current pixel. Parameter flags: a constant selecting the repair algorithm.
However, because the cvInpaint function is aimed at color image restoration, the invention improves the original FMM algorithm according to the gray-level characteristics of the depth map, in particular the repair-mask generation part of the original FMM algorithm, and designs two mask (inpaintMask) generation methods.
S23: generating a mask for image restoration;
First, after the raw data acquired by the Kinect are converted and preprocessed, the raw color and depth images are obtained. As shown in fig. 3, it is then determined whether a hole in the depth map is located at an edge position of the image: if not, manual mask generation (selective mask generation) is used; if so, automatic mask generation is used. Manual mask generation is implemented by adding a mouse event and threshold processing on the basis of the FMM algorithm, and the automatic mask generation method obtains the mask by performing inverse binary thresholding.
S231: generating a selective mask;
When the hole noise of the depth map is located in a non-edge area of the image, a manual mask generation (selective mask generation) method is adopted. Concretely, a mouse callback function is added to the original cvInpaint function, and a threshold range is then set according to the depth map hole (between 0 and 255); two representative color thresholds are set in the embodiment of the invention, one of which is white (235-255).
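The patent does not give the callback's code; the following is only a rough OpenCV sketch of the idea, assuming the user paints over the hole with the left mouse button held down, and that the names (depthVis, onMouse, mask_manual.png) are illustrative:
#include <opencv2/opencv.hpp>
// Sketch of selective (manual) mask generation: pixels the user drags over are
// marked as the region to be repaired.
static cv::Mat mask;  // single-channel mask, 255 = region to be repaired
static void onMouse(int event, int x, int y, int flags, void*) {
    if (event == cv::EVENT_MOUSEMOVE && (flags & cv::EVENT_FLAG_LBUTTON))
        cv::circle(mask, cv::Point(x, y), 3, cv::Scalar(255), -1);  // paint the mask
}
int main() {
    cv::Mat depthVis = cv::imread("depth_vis.png", cv::IMREAD_GRAYSCALE);  // hypothetical 8-bit view
    mask = cv::Mat::zeros(depthVis.size(), CV_8UC1);
    cv::namedWindow("depth");
    cv::setMouseCallback("depth", onMouse);
    for (;;) {
        cv::imshow("depth", depthVis);
        if (cv::waitKey(15) == 27) break;  // Esc finishes mask selection
    }
    cv::imwrite("mask_manual.png", mask);
    return 0;
}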
S232: thresholding mask generation based on the inverse binary;
When the hole noise of the depth map is located in an image edge area, the hole is repaired with the automatic mask generation method (mask generation based on inverse binary thresholding), which preserves the depth map edges better. dΩ represents the repair mask of the function in OpenCV, and dΩ(x, y) is the inverse binary thresholding function that determines the mask of the region to be repaired, as shown in equation (8):
dΩ(x, y) = 255 if I(x, y) ≤ threshold, and dΩ(x, y) = 0 otherwise, for (x, y) ∈ Ω    (8)
In formula (8), the depth image is binarized according to the threshold range of the depth image's hole noise: pixels whose gray value is not greater than the threshold are set to 255 and the rest to 0, which yields the mask of the region to be repaired. The mask is then fed into the FMM algorithm to repair the image.
In the embodiment of the present invention, the threshold is set to 35; the specific value can be chosen according to actual needs, and the embodiment of the present invention is not limited to this value.
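Under the stated setting (threshold 35, hole pixels no brighter than the threshold), the automatic mask can be sketched with OpenCV's threshold function in THRESH_BINARY_INV mode; the 8-bit input view and the file names are assumptions:
#include <opencv2/opencv.hpp>
int main() {
    // 8-bit view of the depth map, in which holes appear as near-black pixels.
    cv::Mat depth8 = cv::imread("depth_vis.png", cv::IMREAD_GRAYSCALE);  // hypothetical file
    // Inverse binary thresholding, matching equation (8): pixels <= 35 become 255
    // (region to be repaired), all other pixels become 0.
    cv::Mat mask;
    cv::threshold(depth8, mask, 35, 255, cv::THRESH_BINARY_INV);
    cv::imwrite("mask_auto.png", mask);
    return 0;
}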
S24: combining the image repairing mask with an FMM algorithm to repair the cavity;
After the image mask is generated for the depth map to be repaired, the holes can be filled by combining it with the fast marching algorithm; the hole repair effect of the embodiment of the invention is shown in fig. 4(b).
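In OpenCV the fast-marching variant of inpainting is selected with the INPAINT_TELEA flag; a minimal sketch of this final repair step, reusing a mask from S231 or S232 and an assumed inpaintRadius of 3 pixels, might look like this:
#include <opencv2/opencv.hpp>
int main() {
    cv::Mat depth8 = cv::imread("depth_vis.png", cv::IMREAD_GRAYSCALE);  // hypothetical 8-bit view
    cv::Mat mask   = cv::imread("mask_auto.png", cv::IMREAD_GRAYSCALE);  // mask from S231 or S232
    // Fill the masked holes with the FMM-based (Telea) inpainting algorithm.
    cv::Mat repaired;
    cv::inpaint(depth8, mask, repaired, 3 /* inpaintRadius */, cv::INPAINT_TELEA);
    cv::imwrite("depth_repaired.png", repaired);
    return 0;
}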
S3: removing edge noise of the depth map by median filtering;
Median filtering is a non-linear filtering method; it is convenient because, in practice, it does not require the statistical properties of the image. Median filtering was first applied in one-dimensional signal processing and later in two-dimensional image processing. Under certain conditions it overcomes the blurring of image detail caused by linear filters (such as neighborhood averaging), and it is most effective against impulse interference and image scanning noise. However, median filtering is not suitable for images with a lot of detail, especially images with many point, line, and sharp-corner details. The basic principle of median filtering is to replace the value of a point in a digital image or number sequence by the median of the values in a neighborhood of that point. The "median" is obtained by arranging the gray values in a neighborhood from largest to smallest (or the reverse) and taking the middle value of the sequence.
In image processing, a noise removal method for filtering an image is generally expressed by equation (9):
Î(x, y) = (1 / Wp) Σ_{(i,j)∈Ω} w(i, j) · I(i, j)    (9)
In formula (9), Î is the image after noise filtering, Ω is a neighborhood of the pixel (x, y), generally a rectangular region centered at (x, y), w(i, j) is the weight of the filter at (i, j), Wp is the normalization parameter, and I is the image containing noise, where:
Wp = Σ_{(i,j)∈Ω} w(i, j)    (10)
s31: median filtering of the image;
Median smoothing is a neighborhood operation. Let the image I be a two-dimensional sequence {x_{i,j}}. When median filtering is performed, the filtering window is also two-dimensional, but it can take various shapes, such as linear, square, circular, cross-shaped, ring-shaped, and so on. Median filtering of the two-dimensional data is expressed by equation (11):
y_{i,j} = Med_W{x_{i,j}} = med{x_{i+r,j+s}, (r, s) ∈ W}    (11)
As described by formula (11), suppose the image has height H and width W; then for an arbitrary position (x, y) in the image, 0 ≤ x ≤ H - 1 and 0 ≤ y ≤ W - 1. A neighborhood centered at (x, y) with width i and height j is taken, where i and j are both odd; the pixels in this neighborhood are sorted, and the median is taken as the pixel value of the output image Y at (x, y).
Specifically, in the embodiment of the present invention, the median filtering is performed on the depth image after the hole repairing by equation (12) to obtain the depth image after the image enhancement processing;
g(x, y) = med{f(x - k, y - l), (k, l) ∈ w}    (12)
where g(x, y) represents the median-filtered image, f(x, y) represents the depth image after hole repair, w represents the two-dimensional median filtering template, and k and l take values in w.
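OpenCV's medianBlur applies exactly this kind of template-based median; a minimal sketch of step (4) on the repaired depth map, with an assumed 3 × 3 window (the next step, S32, discusses how the window size is chosen), is:
#include <opencv2/opencv.hpp>
int main() {
    // Depth map whose holes have already been filled (output of the FMM repair step).
    cv::Mat repaired = cv::imread("depth_repaired.png", cv::IMREAD_GRAYSCALE);  // hypothetical file
    // Median filtering with a 3x3 template; a 5x5 template can be tried if edge noise remains.
    cv::Mat smoothed;
    cv::medianBlur(repaired, smoothed, 3);
    cv::imwrite("depth_enhanced.png", smoothed);
    return 0;
}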
S32: selecting the size of a filtering window;
Furthermore, the filtering parameters are adjusted to suit smoothing of the depth image, and the better-performing parameters are selected by comparing and analysing multiple groups of experiments. Properly adjusting the filtering window parameter winsize improves the black-hole filling effect, but when winsize is increased beyond a certain range the edges of the depth image become blurred and image detail is easily lost.
It should be noted that, in practice, the window size should preferably not exceed the size of the smallest valid object in the image; generally a 3 × 3 window is used first and then 5 × 5, increasing gradually until the filtering effect is satisfactory. For images with long, slowly varying object contours, a square or circular window is suitable; for images containing objects with sharp corners, a cross-shaped window is suitable.
S33: calculating a median value of the gray scale by neighborhood operation;
S331: judging the image boundary: first, set the height and width of the filtering window to winH and winW (introduced in step S31; both are odd), and let halfWinH = (winH - 1)/2 and halfWinW = (winW - 1)/2. Check the boundary conditions: whether x - halfWinH < 0 or ≥ 0, and whether y - halfWinW < 0 or ≥ 0; record the height and width of the image as h and w, and check whether x + halfWinH > h - 1 and y + halfWinW > w - 1 hold.
S332: taking a neighborhood: on the basis of step S331, the gray level neighborhood to be compared is marked as R,
then there are:
R=[max(x-halfWinH,0):min(h-1,x+halfWinH),max(y-halfWinW,0):min(w-1,y+halfWinW)]
where max and min return the larger and the smaller of their arguments, respectively.
S333: calculating the median: using the neighborhood from step S332, the median of the gray values is calculated according to formula (11) in step S31.
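Steps S331-S333 amount to a hand-rolled median filter whose window is clamped at the image border; a compact sketch of that procedure is given below, where only halfWinH and halfWinW follow the text and everything else is illustrative.
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>
// Manual median filter following S331-S333: clamp the window at the image
// boundary, collect the neighborhood R, and take the middle gray value.
cv::Mat manualMedian(const cv::Mat& src, int winH, int winW) {
    CV_Assert(src.type() == CV_8UC1 && winH % 2 == 1 && winW % 2 == 1);
    const int halfWinH = (winH - 1) / 2, halfWinW = (winW - 1) / 2;
    const int h = src.rows, w = src.cols;
    cv::Mat dst = src.clone();
    std::vector<uchar> R;
    for (int x = 0; x < h; ++x) {
        for (int y = 0; y < w; ++y) {
            R.clear();
            // S331/S332: clamp the neighborhood to the image boundary and collect it.
            for (int i = std::max(x - halfWinH, 0); i <= std::min(h - 1, x + halfWinH); ++i)
                for (int j = std::max(y - halfWinW, 0); j <= std::min(w - 1, y + halfWinW); ++j)
                    R.push_back(src.at<uchar>(i, j));
            // S333: median of the collected gray values.
            std::nth_element(R.begin(), R.begin() + R.size() / 2, R.end());
            dst.at<uchar>(x, y) = R[R.size() / 2];
        }
    }
    return dst;
}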
For the depth map with its holes filled, image median filtering is used to smooth the depth map's edge noise and tiny burr-like holes; the effect on the noise repair map acquired by the Kinect in this embodiment is shown in FIG. 5.
In the embodiment of the invention, data acquisition and processing are performed on a PC. The system platform is based on the Windows operating system; the depth camera sensor is the Kinect v1 produced by Microsoft, used with the matching software development interface Kinect_SDK for Windows. The compiling environment for processing the color and depth data is Visual Studio with C++11, together with the OpenCV data processing library version 2.4.9 and the matrix computation software Matlab version 2013a.
The invention discloses an image acquisition and depth map enhancement method and device suitable for the Kinect, relating to the fields of image processing and robot vision. The method mainly comprises three parts: Kinect_SDK-based depth camera data acquisition, improved FMM-based depth map hole repair, and depth map edge noise smoothing. The first part obtains the preliminary raw data of the color and depth maps and then converts their format to obtain the raw color and depth data. The second part, aimed at the depth map data obtained in the previous step and the holes existing in the depth map, improves the original FMM (fast marching method) algorithm and proposes a repair mask generation method based on inverse binary thresholding to repair large-area holes in the depth map. The third part, aimed at the depth image obtained after hole repair, applies an image edge noise smoothing method based on median filtering to further enhance the depth image. With this method, the raw color and depth images of the Kinect depth camera can first be obtained, and the acquired data can be applied to three-dimensional scene reconstruction, robot V-SLAM, and so on; at the same time, for the holes and noise that are ubiquitous in depth maps due to the sensor and the environment, the improved FMM-based hole repair method and the median-filtering-based noise smoothing method can noticeably remove holes and noise from the depth map, further enhancing the applicability and reliability of the Kinect in computer vision research.
Fig. 6 shows a schematic structural diagram of an apparatus according to the present invention, which includes:
the image format conversion module is used for acquiring an original depth image and performing format conversion on the original depth image to obtain a depth image in a target format;
the mask image generation module is used for determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is located at the edge position of the image or not, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
the repair module is used for filling holes in the region to be repaired in the depth image of the target format by combining the mask of the region to be repaired with the fast marching algorithm to obtain the repaired depth image;
and the filtering module is used for performing median filtering on the depth image after hole repair so as to remove image edge noise and obtain the image-enhanced depth image.
The specific implementation of each module may refer to the description in the method embodiment, and the embodiment of the present invention will not be repeated.
In another embodiment of the present invention, a computer readable storage medium is further provided, on which program instructions are stored, and when executed by a processor, the method for image acquisition and depth map enhancement for a Kinect as described in any one of the above is implemented.
It should be noted that, according to the implementation requirement, each step/component described in the present application can be divided into more steps/components, and two or more steps/components or partial operations of the steps/components can be combined into new steps/components to achieve the purpose of the present invention.
The above-described method according to the present invention can be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk, or as computer code originally stored in a remote recording medium or a non-transitory machine-readable medium and downloaded through a network to be stored in a local recording medium, so that the method described herein can be carried out by such software stored on a recording medium and executed on a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It will be appreciated that the computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code which, when accessed and executed by the computer, processor, or hardware, implements the processing methods described herein. Further, when a general-purpose computer accesses code for implementing the processing shown herein, execution of the code transforms the general-purpose computer into a special-purpose computer for performing the processing shown herein.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. An image acquisition and depth map enhancement method suitable for Kinect is characterized by comprising the following steps:
(1) obtaining an original depth image, and performing format conversion on the original depth image to obtain a depth image in a target format;
(2) determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is located at the edge position of the image, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
(3) combining the mask of the region to be repaired with a fast marching algorithm to fill the hole of the region to be repaired in the depth image of the target format to obtain a repaired depth image;
(4) performing median filtering on the depth image after hole repair to remove image edge noise and obtain the image-enhanced depth image.
2. The method of claim 1, wherein step (2) comprises:
if the region to be repaired in the depth image of the target format is not located at the edge of the image, adding a mouse callback function to the OpenCV interface function cvInpaint, and then setting a color threshold range according to the region to be repaired in the depth image of the target format to obtain the mask of the region to be repaired.
3. The method of claim 1, wherein step (2) comprises:
if the region to be repaired in the depth image of the target format is located at the edge of the image, by
dΩ(x, y) = 255 if I(x, y) ≤ threshold, and dΩ(x, y) = 0 otherwise, for (x, y) ∈ Ω
determining the inverse binary thresholding function of the mask of the region to be repaired, so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
4. The method of claim 2 or 3, wherein step (3) comprises:
by
D(p) = Σ_{q∈Bε(p)} w(p, q)[D(q) + ∇D(q)·(p - q)] / Σ_{q∈Bε(p)} w(p, q)
filling holes in the region to be repaired in the depth image of the target format, wherein point p is a pixel to be repaired, Dp represents the depth value at point p, Bε(p) denotes the neighborhood of point p, q is a pixel in Bε(p), w(p, q) is used to measure the similarity between point p and the neighborhood pixel q, Dq represents the depth value at point q, ∇Dq represents the luminance gradient value at point q, and (p - q) represents the geometric distance between pixel p and pixel q.
5. The method of claim 4, wherein step (4) comprises:
performing median filtering on the depth image after hole repair by g(x, y) = med{f(x - k, y - l), (k, l) ∈ w} to obtain the image-enhanced depth image, wherein g(x, y) represents the median-filtered image, f(x, y) represents the depth image after hole repair, w represents the two-dimensional median filtering template, and k and l take values in w.
6. An image acquisition and depth map enhancement device suitable for Kinect is characterized by comprising:
the image format conversion module is used for acquiring an original depth image and performing format conversion on the original depth image to obtain a depth image in a target format;
the mask image generation module is used for determining a target mask generation mode according to whether the region to be repaired in the depth image in the target format is located at the edge position of the image or not, and obtaining a mask of the region to be repaired in the depth image in the target format based on the target mask generation mode;
the repairing module is used for combining the mask of the region to be repaired with a fast marching algorithm to fill the hole of the region to be repaired in the depth image of the target format to obtain a repaired depth image;
and the filtering module is used for performing median filtering on the depth image after hole repair so as to remove image edge noise and obtain the image-enhanced depth image.
7. The apparatus according to claim 6, wherein the mask image generation module is configured to add a mouse callback function to the OpenCV interface function cvInpaint when the region to be repaired in the depth image of the target format is not located at an edge of the image, and then set a color threshold range according to the region to be repaired in the depth image of the target format to obtain the mask of the region to be repaired.
8. The apparatus according to claim 6, wherein the mask image generation module is configured to, when the region to be repaired in the depth image of the target format is located at the edge of the image, by
dΩ(x, y) = 255 if I(x, y) ≤ threshold, and dΩ(x, y) = 0 otherwise, for (x, y) ∈ Ω
determine the inverse binary thresholding function of the mask of the region to be repaired, so as to set the color threshold range of the region to be repaired and obtain the mask of the region to be repaired, wherein threshold is a preset threshold, I(x, y) represents the pixel value at pixel point (x, y) of the region to be repaired, and Ω represents the region to be repaired.
9. The apparatus of claim 7 or 8, wherein the repair module is configured to, by
D(p) = Σ_{q∈Bε(p)} w(p, q)[D(q) + ∇D(q)·(p - q)] / Σ_{q∈Bε(p)} w(p, q)
fill holes in the region to be repaired in the depth image of the target format, wherein point p is a pixel to be repaired, Dp represents the depth value at point p, Bε(p) denotes the neighborhood of point p, q is a pixel in Bε(p), w(p, q) is used to measure the similarity between point p and the neighborhood pixel q, Dq represents the depth value at point q, ∇Dq represents the luminance gradient value at point q, and (p - q) represents the geometric distance between pixel p and pixel q.
10. The apparatus according to claim 9, wherein the filtering module is configured to perform median filtering on the depth image after hole repair by g(x, y) = med{f(x - k, y - l), (k, l) ∈ w} to obtain the image-enhanced depth image, wherein g(x, y) represents the median-filtered image, f(x, y) represents the depth image after hole repair, w represents the two-dimensional median filtering template, and k and l take values in w.
CN201910917492.3A 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect Active CN110675346B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910917492.3A CN110675346B (en) 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect
AU2020101832A AU2020101832A4 (en) 2019-09-26 2020-08-14 Image collection and depth image enhancement method and apparatus for kinect

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910917492.3A CN110675346B (en) 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect

Publications (2)

Publication Number Publication Date
CN110675346A true CN110675346A (en) 2020-01-10
CN110675346B CN110675346B (en) 2023-05-30

Family

ID=69079266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910917492.3A Active CN110675346B (en) 2019-09-26 2019-09-26 Image acquisition and depth map enhancement method and device suitable for Kinect

Country Status (2)

Country Link
CN (1) CN110675346B (en)
AU (1) AU2020101832A4 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815532A (en) * 2020-07-09 2020-10-23 浙江大华技术股份有限公司 Depth map repairing method and related device thereof
CN112070689A (en) * 2020-08-24 2020-12-11 中国人民解放军陆军装甲兵学院 Data enhancement method based on depth image
CN112200848A (en) * 2020-10-30 2021-01-08 中国科学院自动化研究所 Depth camera vision enhancement method and system under low-illumination weak-contrast complex environment
CN112488942A (en) * 2020-12-02 2021-03-12 北京字跳网络技术有限公司 Method, device, equipment and computer readable medium for repairing image
CN112991504A (en) * 2021-04-09 2021-06-18 同济大学 Improved method for filling holes based on TOF camera three-dimensional reconstruction
CN112991193A (en) * 2020-11-16 2021-06-18 武汉科技大学 Depth image restoration method, device and computer-readable storage medium
WO2021169705A1 (en) * 2020-02-26 2021-09-02 深圳市瑞立视多媒体科技有限公司 Method, apparatus and device for processing gesture depth information, and storage medium
CN113379780A (en) * 2021-05-19 2021-09-10 昆山丘钛微电子科技股份有限公司 Frame grabbing image optimization method and device
CN113554721A (en) * 2021-07-23 2021-10-26 北京百度网讯科技有限公司 Image data format conversion method and device
CN113570524A (en) * 2021-08-06 2021-10-29 山西大学 Restoration method for high-reflection noise depth image
CN114066779A (en) * 2022-01-13 2022-02-18 杭州蓝芯科技有限公司 Depth map filtering method and device, electronic equipment and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113034385B (en) * 2021-03-01 2023-03-28 嘉兴丰鸟科技有限公司 Grid generating and rendering method based on blocks
CN113781424B (en) * 2021-09-03 2024-02-27 苏州凌云光工业智能技术有限公司 Surface defect detection method, device and equipment
CN117475157B (en) * 2023-12-25 2024-03-15 浙江大学山东(临沂)现代农业研究院 Agricultural planting enhancement monitoring method based on unmanned aerial vehicle remote sensing

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455984A (en) * 2013-09-02 2013-12-18 清华大学深圳研究生院 Method and device for acquiring Kinect depth image
CN103905813A (en) * 2014-04-15 2014-07-02 福州大学 DIBR hole filling method based on background extraction and partition recovery
CN105608678A (en) * 2016-01-11 2016-05-25 宁波大学 Sparse-distortion-model-representation-based depth image hole recovering and denoising method
CN105741265A (en) * 2016-01-21 2016-07-06 中国科学院深圳先进技术研究院 Depth image processing method and depth image processing device
US20170161546A1 (en) * 2015-12-08 2017-06-08 Mitsubishi Electric Research Laboratories, Inc. Method and System for Detecting and Tracking Objects and SLAM with Hierarchical Feature Grouping
US20180061068A1 (en) * 2015-05-06 2018-03-01 Peking University Shenzhen Graduate School Depth/Disparity Map Post-processing Method and Device
CN109541630A (en) * 2018-11-22 2019-03-29 武汉科技大学 A method of it is surveyed and drawn suitable for Indoor environment plane 2D SLAM
CN109903321A (en) * 2018-10-16 2019-06-18 迈格威科技有限公司 Image processing method, image processing apparatus and storage medium
CN110264563A (en) * 2019-05-23 2019-09-20 武汉科技大学 A kind of Octree based on ORBSLAM2 builds drawing method

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455984A (en) * 2013-09-02 2013-12-18 清华大学深圳研究生院 Method and device for acquiring Kinect depth image
CN103905813A (en) * 2014-04-15 2014-07-02 福州大学 DIBR hole filling method based on background extraction and partition recovery
US20180061068A1 (en) * 2015-05-06 2018-03-01 Peking University Shenzhen Graduate School Depth/Disparity Map Post-processing Method and Device
US20170161546A1 (en) * 2015-12-08 2017-06-08 Mitsubishi Electric Research Laboratories, Inc. Method and System for Detecting and Tracking Objects and SLAM with Hierarchical Feature Grouping
CN105608678A (en) * 2016-01-11 2016-05-25 宁波大学 Sparse-distortion-model-representation-based depth image hole recovering and denoising method
CN105741265A (en) * 2016-01-21 2016-07-06 中国科学院深圳先进技术研究院 Depth image processing method and depth image processing device
CN109903321A (en) * 2018-10-16 2019-06-18 迈格威科技有限公司 Image processing method, image processing apparatus and storage medium
CN109541630A (en) * 2018-11-22 2019-03-29 武汉科技大学 A method of it is surveyed and drawn suitable for Indoor environment plane 2D SLAM
CN110264563A (en) * 2019-05-23 2019-09-20 武汉科技大学 A kind of Octree based on ORBSLAM2 builds drawing method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
刘田间 et al.: "Research on a Depth Image Inpainting Algorithm", Information Technology *
孟恬 et al.: "Depth Image Enhancement Algorithm Based on the Fast Marching Method", Computer Applications and Software *
彭诚 et al.: "Research on an Improved Depth Image Inpainting Algorithm", Journal of Chongqing Technology and Business University (Natural Science Edition) *
李应彬 et al.: "Research on a Kinect Depth Image Hole Repair Algorithm Based on Improved Bilateral Filtering", Industrial Control Computer *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021169705A1 (en) * 2020-02-26 2021-09-02 深圳市瑞立视多媒体科技有限公司 Method, apparatus and device for processing gesture depth information, and storage medium
CN111815532A (en) * 2020-07-09 2020-10-23 浙江大华技术股份有限公司 Depth map repairing method and related device thereof
CN112070689A (en) * 2020-08-24 2020-12-11 中国人民解放军陆军装甲兵学院 Data enhancement method based on depth image
CN112200848A (en) * 2020-10-30 2021-01-08 中国科学院自动化研究所 Depth camera vision enhancement method and system under low-illumination weak-contrast complex environment
CN112991193A (en) * 2020-11-16 2021-06-18 武汉科技大学 Depth image restoration method, device and computer-readable storage medium
CN112488942A (en) * 2020-12-02 2021-03-12 北京字跳网络技术有限公司 Method, device, equipment and computer readable medium for repairing image
CN112991504A (en) * 2021-04-09 2021-06-18 同济大学 Improved method for filling holes based on TOF camera three-dimensional reconstruction
CN113379780A (en) * 2021-05-19 2021-09-10 昆山丘钛微电子科技股份有限公司 Frame grabbing image optimization method and device
CN113379780B (en) * 2021-05-19 2024-04-16 昆山丘钛微电子科技股份有限公司 Frame grabbing image optimization method and device
CN113554721A (en) * 2021-07-23 2021-10-26 北京百度网讯科技有限公司 Image data format conversion method and device
CN113554721B (en) * 2021-07-23 2023-11-14 北京百度网讯科技有限公司 Image data format conversion method and device
CN113570524A (en) * 2021-08-06 2021-10-29 山西大学 Restoration method for high-reflection noise depth image
CN113570524B (en) * 2021-08-06 2023-09-22 山西大学 Repairing method for high reflection noise depth image
CN114066779A (en) * 2022-01-13 2022-02-18 杭州蓝芯科技有限公司 Depth map filtering method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
AU2020101832A4 (en) 2020-09-24
CN110675346B (en) 2023-05-30

Similar Documents

Publication Publication Date Title
CN110675346B (en) Image acquisition and depth map enhancement method and device suitable for Kinect
US9773302B2 (en) Three-dimensional object model tagging
US9818232B2 (en) Color-based depth smoothing of scanned 3D model to enhance geometry in 3D printing
EP2187620B1 (en) Digital image processing and enhancing system and method with function of removing noise
JP4006007B2 (en) Crack detection method
JP5421192B2 (en) Crack detection method
JP6115214B2 (en) Pattern processing apparatus, pattern processing method, and pattern processing program
CN109166125A (en) A kind of three dimensional depth image partitioning algorithm based on multiple edge syncretizing mechanism
CN110276831B (en) Method and device for constructing three-dimensional model, equipment and computer-readable storage medium
JP5534411B2 (en) Image processing device
CN111179189A (en) Image processing method and device based on generation countermeasure network GAN, electronic equipment and storage medium
JP5812705B2 (en) Crack detection method
JP5705711B2 (en) Crack detection method
CN111972700A (en) Cigarette appearance detection method and device, equipment, system and medium thereof
JP2018128309A (en) Crack detection method
WO2023019793A1 (en) Determination method, cleaning robot, and computer storage medium
Chang et al. Beyond camera motion blur removing: How to handle outliers in deblurring
US20160266348A1 (en) Method for creating a camera capture effect from user space in a camera capture system
JP2019028912A (en) Image processing apparatus and image processing method
CN112288726B (en) Method for detecting foreign matters on belt surface of underground belt conveyor
JPH0961138A (en) Crack extraction apparatus
CN106910166B (en) Image processing method and device
CN110827375B (en) Infrared image true color coloring method and system based on low-light-level image
CN116030430A (en) Rail identification method, device, equipment and storage medium
JP5157575B2 (en) Defect detection method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant