CN109118493B

CN109118493B - Method for detecting salient region in depth image

Info

Publication number: CN109118493B
Application number: CN201810757983.1A
Authority: CN
Inventors: 李捷; 周宏扬; 袁夏; 赵春霞
Original assignee: Nanjing University of Science and Technology
Current assignee: Nanjing University of Science and Technology
Priority date: 2018-07-11
Filing date: 2018-07-11
Publication date: 2021-09-10
Anticipated expiration: 2038-07-11
Also published as: CN109118493A

Abstract

The invention discloses a method for detecting a salient region in a depth image. Firstly, calculating the difference between each pixel point and adjacent points in a depth map to obtain the gradient characteristic of each pixel point; then, an initial saliency map is obtained by gradient feature calculation by adopting a global contrast method; then, carrying out peak-valley detection and zero-parallax region estimation according to the histogram statistical characteristics of the depth map, and implementing division of a background region and a foreground region, thereby inhibiting the significance of a background part and keeping the significance of the foreground part; and finally, further optimizing by using the extended super-pixel division method to obtain a final salient region detection image. The method effectively inhibits the significance of the background part and optimizes the significance of the foreground part by dividing the background area and the superpixel, provides reliable significant areas for target detection, target identification, scene understanding and the like, and improves the acquisition capability of the image interesting area.

Description

Method for detecting salient region in depth image

Technical Field

The invention relates to a region detection method technology, in particular to a method for detecting a salient region in a depth image.

Background

Depth information is one of important information channels of human visual attention mechanisms, and can help people to quickly find interested areas in complex scenes.

With the progress of sensor technology, the acquisition of depth images is easier, and how to understand a three-dimensional scene represented by the depth images is an important problem to be solved in the fields of intelligent robot navigation, environment modeling, somatosensory games and the like. The salient regions may be used to guide the robot in discovering potential objects in the scene and reduce the computational load of environmental understanding.

The currently common method for calculating the significant area of the depth map is based on the global contrast of the depth value, the depth value is directly used for calculation, and the method is easily interfered by noise data in the depth map, so that the detection result is inaccurate. Furthermore, a change in the depth value range of the different depth map makes the detection result unstable.

Disclosure of Invention

The invention aims to provide a method for detecting a salient region of a depth image, which can accurately and stably estimate the salient region.

The technical solution for realizing the purpose of the invention is as follows: a salient region detection method of a depth image comprises the following steps:

step 1, aiming at each pixel I in the depth image I_kSeparately extracting gradient features

Step 2, calculating an initial significance value S (I) of each pixel by adopting a global contrast calculation mode according to the gradient features extracted in the step 1_k) Obtaining an initial saliency map of the same resolution;

step 3, utilizing the histogram statistical characteristics of the depth map to detect wave crests and wave troughs;

step 4, estimating a zero parallax area ZPA of the depth image;

step 5, dividing a background area and a foreground area in the depth image according to the peak and trough detection result and the zero parallax area ZPA, adjusting the saliency map obtained in the step 2 according to the background area and the foreground area, and inhibiting the saliency value of the background area to obtain an improved saliency map;

and 6, performing superpixel segmentation on the original image by adopting a superpixel segmentation algorithm, and then optimizing the saliency map obtained in the step 5 to obtain a final saliency area.

Compared with the prior art, the invention has the following remarkable advantages: (1) the method adopts the gradient characteristic of the depth value as the calculation basis of the overall contrast, is less interfered by noise and is not influenced by the change of the range of the depth value, so that the result is more accurate; (2) the invention effectively inhibits the significance of the background and the salient target area by utilizing the method of dividing the background area, and improves the stability of the detection result.

The invention is further described below with reference to the accompanying drawings.

Drawings

Fig. 1 is a flowchart of a salient region detection method of a depth map according to the present invention.

Fig. 2 is a schematic diagram of salient region detection according to an embodiment of the present invention, in which a diagram (a) is an original image, a diagram (b) is a depth image, and a diagram (c) is a detection result diagram.

Detailed Description

A method for detecting a salient region in a depth image comprises the following steps:

In a further embodiment, extracting the gradient feature of the depth map specifically includes the following steps:

step 1-1, traversing all pixel points of the depth image I to obtain a gradient vector of each pixel point, and counting the pixel points I^kN, N is the total number of pixels, and its gradient vector (dr) is^k,dc^k) The calculation formula is as follows:

dr^k＝(dep(r+1,c)-dep(r-1,c)/2 (1)

dc^k＝(dep(r,c+1)-dep(r,c-1)/2 (2)

wherein r and c correspond to rows and columns of image coordinates, dep (r, c) represents the depth value of the r-th row and c-th column in the depth image I;

step 1-2, traversing all pixel points to obtain the gradient characteristic of each point, and obtaining a pixel point I^kCharacteristic of gradient of

The method specifically comprises the following steps:

where ε is a constant greater than zero and Maximun is specified as

And

is measured.

in a further embodiment, an initial saliency value S (I) of each pixel is calculated by a global contrast method according to the gradient characteristics_k) Obtaining an initial saliency map of the same resolution, specifically comprising the following steps:

step 2-1, normalizing two elements of all gradient feature vectors to an interval [0, 255%]Rounding off to obtain integer value and pixel point I^kAfter being normalized, the gradient features of

Thereby the gradient characteristic values of all pixel points

All correspond to [0,255]An integer of up to 256 different values, noted

The same can be obtained

Step 2-2, obtaining characteristic values according to the calculation mode of the global contrast

Corresponding significance values:

where n is 256, the total number of feature values extracted from the depth image, and f_jRepresents

The probability of occurrence in the image is,

is composed of

And

a distance metric function of the two features; for characteristic value

In the same way, the corresponding significance values are obtained:

step 2-3, the corresponding significance values of the pixels with the same characteristic value are also the same, and the pixel I is subjected to^kIf its characteristic value is

The initial saliency value for that pixel is then:

wherein, w_aAnd w_bIs a weight parameter; for each pixel, according to the characteristic value of the pixel, the significance value of the pixel can be obtained, and therefore the initial significance map of the full resolution is obtained.

in a further embodiment, the peak-valley detection is performed by using the histogram statistical features of the depth map, and the steps are as follows:

step 3-1, dividing the depth values of all pixels in the depth map into 256 intervals, and counting the number of the pixels of the depth values in each interval range to obtain a statistical histogram;

step 3-2, calculating a derivative of the histogram statistic value to obtain the growth rate of each position corresponding to the abscissa of the histogram, and forming a vector alpha ═ alpha₁,α₂,…,α₂₅₆}；

Step 3-3, taking alpha_iSymbol λ of 1,2, …, 256_αiAnd forming them into a vector lambda in order_α＝{λ_α1,λ_α2,…,λ_α256}，α_iSign of (a)_αiThe concrete formula of (1) is as follows:

step 3-4, vector λ_αCarrying out mean value filtering, and executing the operation of the step 3-3 on the filtered result to obtain a new digital string lambda_β＝{λ_β1,λ_β2,…,λ_β256}；

Step 3-5, vector lambda is aligned by adopting a template matching mode_βCarrying out jump detection; there are 4 types of hopping: [1, -1]，[1,0,-1]The jump position is the peak position P_p；[-1,1]，[-1,0,1]Corresponding to the position P of the wave trough_t。

Step 4, estimating a Zero Parallax Area (ZPA) of the depth map;

in a further embodiment, the specific step of estimating the zero-disparity region ZPA of the depth image is as follows:

step 4-1, calculating the median of the depth values in the depth image, i.e.

And 4-2, taking the median as a center, wherein the area within the range of the distance between the front and the back of the center and the distance between the front and the back of the center is ZPA of the scene:

in equation (9), H is the depth-of-field (DOF) of the scene, and σ is the scale parameter.

And 5, dividing a background area and a foreground area in the depth image, adjusting the initial saliency map obtained in the step 2 according to the background area and the foreground area, inhibiting the saliency value of the background area, and obtaining an improved saliency map.

In a further embodiment, the improved saliency map is obtained by the following steps:

step 5-1, determining a depth value corresponding to a peak-valley position which is behind the zero parallax zone ZPA and is closest to the zero parallax zone ZPA, namely a final threshold value T of the background estimation:

T＝min(p),st p∈{P_p,P_tand p > ZPA (10)

Step 5-2, in the depth image, taking an area with a depth value larger than a background threshold value T as a background part, and taking a part with a depth value smaller than T as a foreground area, and thus determining whether a pixel at a corresponding position in the saliency map belongs to the background part or the foreground part; suppressing the significance value of the background part in the significance map, and reserving the significance value of the foreground part in the significance map to obtain an improved significance map, wherein the suppression formula of the significance value of the background part is as follows:

in the formula, dep^kIs a pixel point I^kCorresponding to the depth value on the depth image, S (I)^k) Is the initial saliency value, S' (I) of the background portion^k) The significance value of the background part after inhibition is shown.

Step 6, performing superpixel segmentation on the original image by adopting a superpixel segmentation algorithm, and then optimizing the saliency map obtained in the step 5 to obtain a final saliency area;

in a further embodiment, the step of optimizing the saliency map in step 5 based on the super-pixel pairs is as follows:

step 6-1, initializing a clustering center: setting the number of superpixels as C, length in two-dimensional space

For the interval, periodically sampling the depth image, taking each sampling point as an initial clustering center, setting the category labels of all the pixels of the initial clustering centers to be 1,2, … and C, setting the category labels of all the pixels of the non-clustering centers to be-1, setting the distance between the pixels of the non-clustering centers and the clustering centers to be infinite, and setting N to be the total number of the pixels in the whole depth image;

step 6-2, for each clustering center I^cC, respectively calculating the cluster center and each pixel point I in the 2s × 2s neighborhood search range of the cluster centerⁱ1, 2., a distance of 2s × 2s, the distance calculation formula is as follows:

wherein dep^cFor clustering central pixel point I^cDepth value of u^c,ν^cIs I^cAbscissa and ordinate in the image; depⁱIs a pixel point IⁱDepth value of uⁱ,νⁱIs IⁱThe horizontal and vertical coordinates in the image, m is the compactness adjusting parameter of the super pixel;

each non-clustering center pixel point is searched by a plurality of surrounding clustering center points, the clustering center corresponding to the minimum distance value is taken as the clustering center of the pixel point, and the clustering center is set as a category label same as the clustering center, so that a super-pixel segmentation result is obtained;

6-3, calculating the depth mean value and the horizontal and vertical coordinate mean values of the pixel points in each super pixel, taking the depth mean value and the coordinate mean value of each super pixel as a new clustering center of the super pixel, and repeating the step 6-2 until the clustering center to which each pixel point belongs does not change any more;

step 6-4, counting the number of pixels contained in each super pixel, and merging the number of pixels with the super pixel with the nearest coordinate position in the adjacent super pixels when the number of pixels is smaller than a set minimum value e; after combination, all the super-pixels R are obtained^cC ═ 1,2,. C ', where C' ≦ C;

6-5, according to the super pixel R^cOptimizing the significance result obtained in step 5, i.e. if I^k∈R^cThen the pixel I^kFinal significance value of S ″ (I)^k) Comprises the following steps:

wherein, | R^cIs at the super-pixel R^cThe number of pixels contained in (a).

The present invention is further illustrated by the following specific examples.

Example 1

As shown in fig. 1, a method for detecting a salient region in a depth image includes the following steps:

step 1, for a depth image I, for each pixel I in the image_kSeparately extracting gradient features

The original image is shown in fig. 2 (a), and the depth image is shown in fig. 2 (b);

dr^k＝(dep(r+1,c)-dep(r-1,c)/2 (1)

dc^k＝(dep(r,c+1)-dep(r,c-1)/2 (2)

The method specifically comprises the following steps:

wherein ε is 0.02 and Maximun is G_aAnd G_bMaximum value of (a), maximum 600 in this example;

step 2, calculating an initial significance value S (I) of each pixel by adopting a global contrast calculation method according to the gradient characteristics in the step 1_k) Obtaining an initial saliency map of the same resolution, specifically comprising the following steps:

Thereby the gradient characteristic values of all pixel points

All correspond to [0,255]An integer of up to 256 different values, noted

The same can be obtained

Corresponding significance values:

The probability of occurrence in the image is,

is composed of

And

a distance metric function of the two features; for characteristic value

In the same way, the corresponding significance values are obtained:

The initial saliency value for that pixel is then:

wherein, w_aAnd w_bAre weight parameters, all set to 0.5; for each pixel, according to the characteristic value of the pixel, the significance value of the pixel can be obtained, and therefore an initial significance map of the full resolution is obtained;

step 3, utilizing the histogram statistical characteristics of the depth map to detect wave crests and wave troughs, and comprising the following steps:

Step 3-3, taking alpha according to a formula (8)_iSign of (a)_αiAnd forming them into a vector lambda in order_α＝{λ_α1,λ_α2,…,λ_α256}：

Step 3-4, vector λ_αCarrying out mean value filtering, repeating the operation of the step 3-3 once on the filtered result to obtain a new digital string lambda_β＝{λ_β1,λ_β2,…,λ_β256}；

Step 3-5, vector lambda is matched in a template matching mode_βCarrying out jump detection; there are 4 types of hopping: [1, -1]，[1,0,-1]The jump position is the peak position P_p；[-1,1]，[-1,0,1]Corresponding to the position P of the wave trough_t。

Step 4, estimating a Zero Parallax Area (ZPA) of the depth image, which comprises the following steps:

step 4-1, calculating the median of the depth values in the depth image, i.e.

And 4-2, determining areas around the median as ZPA of the scene:

in equation (9), H is the depth-of-field (DOF) of the scene, and σ is 0.1, which is a scale parameter.

Step 5, dividing a background area and a foreground area in the depth image, adjusting the saliency map obtained in the step 2 according to the background area, suppressing the saliency value of the background area, and obtaining an improved saliency map, wherein the steps are as follows:

T＝min(p),st p∈{P_p,P_tand p > ZPA (10)

in the formula, dep^kIs a pixel point I^kCorresponding to the depth value on the depth image, S (I)^k) Is the initial saliency value, S' (I) of the background portion^k) For background partial significance value after suppression

And 6, further improving and obtaining a final significant region detection result by adopting a method based on super-pixel division, wherein the steps are as follows:

step 6-1, initializing a clustering center: setting the number of superpixels as the number of superpixels C1600 and the length in the two-dimensional space in the whole depth image

Periodically sampling the depth image for intervals, taking each sampling point as an initial clustering center, setting the class labels of all pixels in the initial clustering centers to be 1,2, … and C, setting the class labels of all pixels in non-clustering centers to be-1, setting the distance between the pixels and the clustering centers to be infinite, setting N to be the total number of pixels in the whole depth image, and for a typical depth image with the resolution of 640 multiplied by 480, corresponding interval length to the depth image is equal to

Step 6-2, for each clustering center I^cC, calculating the cluster center and each pixel point I in the 28 × 28 neighborhood search range respectivelyⁱ1, 2., a distance of 2s × 2s, the distance calculation formula is as follows:

wherein dep^cFor clustering central pixel point I^cDepth value of u^c,ν^cIs I^cAbscissa and ordinate in the image; depⁱIs a pixel point IⁱDepth value of uⁱ,νⁱIs IⁱAbscissa and ordinate in the image; the compactness adjusting parameter m of the super pixel is 40;

6-3, calculating the depth mean value and the horizontal and vertical coordinate mean values of the pixel points in each super pixel, taking the depth mean value and the coordinate mean value of each super pixel as a new clustering center of the super pixel, and repeating the step 6-2 until the clustering center to which each pixel point belongs does not change any more; in the embodiment, 10 iterations can obtain ideal effects on most pictures, so that 10 iterations are selected;

step 6-4, setting the minimum value e of the number of pixels contained in the superpixel to be 20, and combining a morphological region smaller than e with a neighborhood thereof; after combination, all the super-pixels R are obtained^cC ═ 1,2,. C ', where C' ≦ C;

wherein, | R^cIs at the super-pixel R^cThe number of pixels contained in (a). As shown in fig. 2 (c), the closer the area in the graph is to white, the higher the saliency value of the area is, and the closer to black, the lower the saliency value is.

The invention adopts the gradient characteristic of the depth value as the calculation basis of the overall contrast, reduces noise interference, is not influenced by the change of the range of the depth value, and improves the accuracy of the detection result; by using a background area division method, the significance of a background and a highlighted target area is effectively inhibited, and the stability of a detection result is improved; the saliency of the foreground part is optimized by dividing the superpixels, so that a saliency map of a full resolution is obtained through calculation, a reliable saliency region is provided for target detection, target identification, scene understanding and the like, and the acquisition capability of an image region of interest is improved.

Claims

1. A method for detecting a salient region in a depth image is characterized by comprising the following steps:

step 4, estimating a zero parallax area ZPA of the depth image;

2. The method according to claim 1, wherein the extracting gradient features of the depth map in step 1 specifically comprises the following steps:

dr^k＝(dep(r+1,c)-dep(r-1,c)) /2 (1)

dc^k＝(dep(r,c+1)-dep(r,c-1)) /2 (2)

The method specifically comprises the following steps:

where ε is a constant greater than zero and Maximun is specified as

And

is measured.

3. The method for detecting the salient region in the depth image according to claim 1, wherein in step 2, an initial saliency value S (I) of each pixel is calculated by adopting a global contrast method according to gradient features_k) Obtaining an initial saliency map of the same resolution, specifically comprising the following steps:

Thereby the gradient characteristic values of all pixel points

All correspond to [0,255]An integer of up to 256 different values, noted

The same can be obtained

Corresponding significance values:

The probability of occurrence in the image is,

is composed of

And

a distance metric function of the two features; for characteristic value

In the same way, the corresponding significance values are obtained:

The initial saliency value for that pixel is then:

4. The method for detecting the significant region in the depth image according to claim 1, wherein the step 3 performs peak-valley detection by using histogram statistical features of the depth image, and comprises the following steps:

Step 3-5, vector lambda is aligned by adopting a template matching mode_βCarrying out jump detection; a total of 4Hopping type: [1, -1]，[1,0,-1]The jump position is the peak position P_p；[-1,1]，[-1,0,1]Corresponding to the position P of the wave trough_t。

5. The method according to claim 2, wherein the step 4 of estimating the zero-disparity region ZPA of the depth image comprises the following specific steps:

step 4-1, calculating the median of the depth values in the depth image, i.e.

And 4-2, taking the median as a center, wherein an area within a range of the distance between the front and the back of the center and the distance between the front and the back of the center are zero parallax areas ZPA of the depth image:

in the formula, H is the depth of field of the scene, and σ is a proportional parameter.

6. The method according to claim 4, wherein in step 5, the background area and the foreground area in the depth image are divided, and the saliency map obtained in step 2 is adjusted accordingly to suppress the saliency value of the background area, so as to obtain an improved saliency map, and the steps are as follows:

T＝min(p),st p∈{P_p,P_tand p > ZPA (10)

7. The method for detecting the salient region in the depth image according to claim 1, wherein the original image is subjected to superpixel segmentation by using a superpixel segmentation algorithm in step 6, and then the salient image obtained in step 5 is optimized to obtain a final salient region, and the specific steps are as follows:

wherein dep^cFor clustering central pixel point I^cDepth value of u^c,ν^cIs I^cAbscissa and ordinate in the image; depⁱIs a pixel point IⁱDepth value of uⁱ,νⁱIs IⁱAbscissa and ordinate in the image;

wherein, | R^cIs at the super-pixel R^cThe number of pixels contained in (a).