CN108090485A - Image foreground extraction method based on multi-view fusion - Google Patents

Image foreground extraction method based on multi-view fusion

Info

Publication number
CN108090485A
Authority
CN
China
Prior art keywords
pixel
image
super
extracted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711216652.9A
Other languages
Chinese (zh)
Inventor
王敏
马宏斌
侯本栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Kunshan Innovation Institute of Xidian University
Original Assignee
Xidian University
Kunshan Innovation Institute of Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University and Kunshan Innovation Institute of Xidian University
Priority to CN201711216652.9A
Publication of CN108090485A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image foreground extraction method based on multi-view fusion, which mainly solves the problems of the laborious extraction process and inaccurate foreground edges of existing graph-cut-based extraction techniques. It is implemented as follows: first train an SVM classifier, then obtain the grayscale image of the image to be extracted; detect the sub-image containing the foreground in the grayscale image with the trained SVM classifier; take the position coordinates of the sub-image in the image to be extracted as the input of the GrabCut algorithm and perform foreground extraction on the image to be extracted, obtaining the extraction result at the pixel level of the image to be extracted; generate the image at the superpixel level from the image to be extracted with the SLIC algorithm; and fuse the image at the superpixel level with the extraction result at the pixel level, obtaining the foreground extraction result of the image to be extracted. The invention simplifies the foreground extraction process, improves the efficiency and precision of extraction, and can be used for stereo vision, image semantic recognition, three-dimensional reconstruction and image search.

Description

Image foreground extraction method based on multi-view fusion
Technical field
The invention belongs to the technical field of image processing, and further relates to an automatic image foreground extraction method based on multi-view fusion. The invention can be used in applications and research such as stereo vision, image semantic recognition and image search.
Background technology
Foreground extraction is a means of extracting a target of interest from an image. It is the technique and process of dividing an image into several specific regions with unique properties and extracting the target of interest from them, and it has become a key step from image processing to image analysis. More concretely, it divides an image, according to features such as gray level, color, texture and shape, into several complementary, non-overlapping regions, such that these features show similarity within the same region and obvious differences between different regions. After decades of development, foreground extraction has gradually formed a scientific system of its own, new extraction methods emerge endlessly, and it has become an interdisciplinary field attracting wide attention from researchers and practitioners in many areas, such as the medical field, airborne and spaceborne remote sensing, industrial inspection, security and the military field.
Current foreground extraction methods mainly include threshold-based methods, edge-based methods, region-based methods, graph-cut-based methods, energy-functional-based methods and deep-learning-based image foreground extraction methods. Among them, graph-cut-based foreground extraction is favored for its high accuracy and easy operation. It is a combinatorial optimization method based on graph theory: according to the user's interactive information, it maps an image to a network, establishes an energy function over the labels, performs a limited number of iterative cuts on the network with the max-flow/min-cut algorithm, and takes the resulting minimum cut of the network as the foreground extraction result of the image. However, because of the human interaction involved, the manual workload becomes too large when extracting from many images, which limits its application in engineering. For example, Meng Tang et al. published "GrabCut in One Cut" at the 2013 IEEE International Conference on Computer Vision: the user selects the foreground region, the region containing the foreground is mapped to a graph, and One Cut performs a limited number of iterative cuts on the mapped graph to obtain the foreground extraction result of the image. But it needs human interaction to calibrate the region where the foreground lies, which makes the foreground extraction process rather laborious, and the limited-iteration energy optimization can only obtain a near-optimal minimum cut, so it is difficult to obtain accurate foreground edges.
Summary of the invention
It is an object of the invention, in view of the deficiencies of the above prior art, to propose an image foreground extraction method based on multi-view fusion, which solves the problems in existing graph-cut-based foreground extraction methods of a laborious extraction process caused by human interaction and of inaccurate foreground edges caused by the limited-iteration energy optimization.
To achieve the above object, the technical solution adopted by the present invention comprises the following steps:
(1) Train an SVM classifier, obtaining a trained SVM classifier;
(2) Convert the image to be extracted to grayscale, obtaining a grayscale image;
(3) Detect, by the trained SVM classifier, the sub-image p_k containing the foreground target in the grayscale image:
(3a) Slide multi-scale windows line by line over the grayscale image at a set interval, obtaining an image set P = {p_1, p_2, ..., p_k, ..., p_q} composed of multiple sub-images, where k ∈ [1, q], p_k is the k-th sub-image and q is the number of sub-images;
(3b) Extract the histogram-of-oriented-gradients HOG features of each sub-image p_k in the image set P, input them into the trained SVM classifier for classification, and compute the label l_pk of p_k;
(3c) Judge whether the label l_pk of p_k is positive: if so, p_k contains the foreground target; record the position of p_k in the image to be extracted, i.e. the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, and perform step (4); otherwise, discard p_k;
(4) Perform foreground extraction on the image to be extracted:
Use the position (x_min, y_min) of the top-left pixel of p_k and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted to replace the human interaction of the GrabCut algorithm, and perform foreground extraction on the image to be extracted with this replacement, obtaining the extraction result S1(x, y) at the pixel level of the image to be extracted;
(5) Compute the superpixels of the image to be extracted with the simple linear iterative clustering algorithm SLIC, obtaining the image at the superpixel level: B = {b_1, b_2, ..., b_i, ..., b_m}, i ∈ [1, m], where b_i is the i-th superpixel and m is the number of superpixels;
(6) Fuse the image B at the superpixel level with the extraction result S1(x, y) at the pixel level of the image to be extracted, obtaining the foreground S2(x_i, y_i) of the image to be extracted.
Compared with the prior art, the present invention has the following advantages:
1) The present invention uses the trained SVM classifier to obtain the sub-image where the foreground lies in the image to be extracted, and uses the position coordinates of this sub-image in the image to be extracted to replace the rectangular region that the human interaction of the GrabCut algorithm would otherwise provide as the algorithm input, realizing foreground extraction on the image to be extracted. By fully combining the SVM classifier with the GrabCut algorithm, the image foreground extraction process can be completed automatically, which solves the problem of the laborious extraction process caused by human interaction in existing graph-cut-based foreground extraction methods and effectively improves the efficiency of image foreground extraction.
2) The present invention performs superpixel extraction on the image to be extracted with the SLIC algorithm, making full use of the good consistency within a superpixel block; by fusing the image at the superpixel level with the extraction result at the pixel level, an accurate foreground extraction result of the image to be extracted can be obtained.
3) By introducing superpixels, the present invention makes the foreground extraction result more accurate and smooth, which solves the problem of inaccurate foreground edges caused by the limited-iteration energy optimization in existing graph-cut-based foreground extraction methods and improves the precision of image foreground extraction.
Description of the drawings
Fig. 1 is the implementation flowchart of the present invention;
Fig. 2 is the structure of the sample image set in the present invention;
Fig. 3 is the schematic diagram of HOG feature extraction in the present invention;
Fig. 4 is the visualization of HOG features in the present invention;
Fig. 5 shows experimental results of using the present invention to extract pedestrians and leaves as the foreground.
Specific embodiments
The present invention is described in further detail below in conjunction with the drawings and specific embodiments.
With reference to Fig. 1, the image foreground extraction method based on multi-view fusion comprises the following steps:
Step 1: train the SVM classifier.
(1a) Collect a sample image set containing the foreground class, and convert all sample images in it to grayscale, obtaining a sample grayscale image set.
The structure of the sample image set containing the foreground class is shown in Fig. 2. It comprises positive samples, negative samples and a sample label file: a positive sample is an image containing the foreground, a negative sample is an image not containing the foreground, and the sample label file describes the class and storage location of the positive and negative samples.
Converting all sample images in the sample image set to grayscale means taking the weighted average of the three channels of each sample image, the red component R, the green component G and the blue component B, to obtain the gray value Gray of the sample grayscale image:
Gray = R × 0.299 + G × 0.587 + B × 0.114
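For illustration, this weighted average maps directly onto a few lines of NumPy; the following is a minimal sketch (the function name and the 8-bit RGB input assumption are ours, not part of the patent):

```python
import numpy as np

def to_gray(rgb):
    """Weighted-average grayscale: Gray = R*0.299 + G*0.587 + B*0.114."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    return (0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)
```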
(1b) Extract the histogram-of-oriented-gradients HOG features of each image in the sample grayscale image set.
With reference to Fig. 3, this step is implemented as follows:
(1b1) Divide the input image into several adjacent, non-overlapping units, and compute the gradient magnitude G(x, y) and gradient direction α(x, y) of the pixels in each unit:
G(x, y) = √(Gx(x, y)² + Gy(x, y)²)
α(x, y) = tan⁻¹(Gx(x, y) / Gy(x, y))
where Gx(x, y) = H(x+1, y) − H(x−1, y) is the horizontal gradient at pixel (x, y) of the input image, Gy(x, y) = H(x, y+1) − H(x, y−1) is the vertical gradient at pixel (x, y) of the input image, and H(x, y) is the pixel value at pixel (x, y) of the input image;
(1b2) Divide all gradient directions α(x, y) into 9 angles, which form the horizontal axis of a histogram; the gradient values accumulated within each angular range form the vertical axis, yielding the gradient histogram;
(1b3) Count the gradient histogram of each unit, obtaining the feature descriptor of each unit;
(1b4) Group 8 × 8 units into a block, and concatenate the feature descriptors of all units in a block, obtaining the HOG feature descriptor of that block;
(1b5) Concatenate the HOG feature descriptors of all blocks in the input image, obtaining the HOG features of the input image. The visualization of the HOG features is shown in Fig. 4, where Fig. 4(a) is the example image and Fig. 4(b) the HOG feature map of the example image; as can be seen from Fig. 4, HOG features describe the appearance and shape of a local target well through the density of gradients or edge directions;
(1b6) Concatenate the HOG features of all input images in the sample grayscale image set, obtaining the HOG feature set of the sample grayscale image set.
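Steps (1b1)-(1b6) are what standard HOG implementations compute; as a sketch, scikit-image's hog covers the cell histograms, block grouping and concatenation in one call (the cell and block sizes below are common defaults chosen for illustration, not the patent's mandated values):

```python
from skimage.feature import hog

def hog_features(gray):
    """HOG descriptor: 9 orientation bins per cell, cells grouped into
    blocks, all block descriptors concatenated, as in (1b1)-(1b6)."""
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')
```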
(1c) Train the SVM classifier with all the HOG features in the sample HOG feature set, obtaining the trained SVM classifier.
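A sketch of step (1c) with scikit-learn, reusing hog_features from the sketch above; it assumes the samples are lists of grayscale images already resized to the detection-window size, so the feature vectors have equal length:

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_svm(pos_grays, neg_grays):
    """Fit a linear SVM on the HOG features of positive (foreground)
    and negative (background) grayscale sample images."""
    X = np.array([hog_features(g) for g in list(pos_grays) + list(neg_grays)])
    y = np.array([1] * len(pos_grays) + [0] * len(neg_grays))
    return LinearSVC().fit(X, y)
```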
Step 2: convert the image to be extracted to grayscale, obtaining its grayscale image.
Take the weighted average of the three channels of the image to be extracted, the red component R′, the green component G′ and the blue component B′, to obtain the gray value Gray′ of each pixel of the image to be extracted:
Gray′ = R′ × 0.299 + G′ × 0.587 + B′ × 0.114
The grayscale image of the image to be extracted is obtained from the gray value of each pixel.
Step 3: use the trained SVM classifier to detect the sub-image p_k containing the foreground target in the grayscale image of the image to be extracted, obtaining the position (x_min, y_min) of the top-left pixel of p_k and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted.
(3a) Slide multi-scale windows line by line over the grayscale image of the image to be extracted at a set interval, obtaining an image set composed of multiple sub-images: P = {p_1, p_2, ..., p_k, ..., p_q}, where p_k is the k-th sub-image and q is the number of sub-images;
(3b) Extract the histogram-of-oriented-gradients HOG features of each sub-image p_k in the image set P and input them into the trained SVM classifier for classification, obtaining the label l_pk of p_k, computed as
l_pk = sgn(ω · x_k + φ)
where ω is the normal vector of the hyperplane of the SVM classifier, x_k is the HOG feature of sub-image p_k, k ∈ [1, q], q is the number of sub-images, and φ is the offset of the hyperplane of the SVM classifier;
(3c) Judge whether the label l_pk of p_k is positive: if so, p_k contains the foreground target; record the position of p_k in the image to be extracted, i.e. the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, the two together forming the rectangular region containing the foreground, and perform step 4; otherwise, discard p_k.
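A sketch of the multi-scale sliding window of (3a)-(3c), using the classifier's signed decision value as the label l_pk; the window size, stride and scales are illustrative choices of ours, not values the patent specifies:

```python
from skimage.transform import rescale

def detect_windows(gray, clf, win=(128, 64), step=16, scales=(1.0, 0.75, 0.5)):
    """Slide a window over several scales of the grayscale image and keep
    the windows the SVM labels as foreground, returned as rectangles in
    the coordinates of the original image."""
    h, w = win
    rects = []
    for s in scales:
        img = rescale(gray, s, anti_aliasing=True)
        for y0 in range(0, img.shape[0] - h + 1, step):
            for x0 in range(0, img.shape[1] - w + 1, step):
                feat = hog_features(img[y0:y0 + h, x0:x0 + w])
                if clf.decision_function([feat])[0] > 0:  # l_pk positive
                    # map the window corners back to the original image
                    rects.append((int(x0 / s), int(y0 / s),
                                  int((x0 + w) / s), int((y0 + h) / s)))
    return rects  # each rect = (x_min, y_min, x_max, y_max)
```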
Step 4: perform foreground extraction on the image to be extracted.
Foreground extraction methods mainly include threshold-based methods, edge-based methods, region-based methods, graph-cut-based methods, energy-functional-based methods and deep-learning-based image foreground extraction methods. This example uses, but is not limited to, the GrabCut algorithm among the graph-cut-based foreground extraction methods. Concretely: the positions (x_min, y_min) and (x_max, y_max) of the top-left and bottom-right pixels of sub-image p_k in the image to be extracted, together with the image to be extracted, are taken as the input of the GrabCut algorithm, and foreground extraction is performed on the image to be extracted, obtaining the extraction result S1(x, y) at the pixel level of the image to be extracted.
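A sketch of step 4 with OpenCV's cv2.grabCut, where the detected rectangle replaces the user-drawn box; the 5-iteration count is an arbitrary choice of ours:

```python
import cv2
import numpy as np

def grabcut_from_rect(img_bgr, rect):
    """rect = (x_min, y_min, x_max, y_max) from the SVM detector; returns
    S1(x, y): 1 for (probable) foreground pixels, 0 for background."""
    x0, y0, x1, y1 = rect
    mask = np.zeros(img_bgr.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)  # background GMM buffer
    fgd = np.zeros((1, 65), np.float64)  # foreground GMM buffer
    cv2.grabCut(img_bgr, mask, (x0, y0, x1 - x0, y1 - y0),
                bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
```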
Step 5: compute the superpixels of the image to be extracted.
Existing methods for computing superpixels include graph-based methods and gradient-descent-based methods; this example uses, but is not limited to, the SLIC algorithm among the gradient-descent-based methods. The concrete steps are as follows:
(5a) Transform the image to be extracted from the RGB color space to the CIE-Lab color space, obtaining the CIE-Lab image.
There is no direct conversion formula between RGB and Lab; the XYZ color space must be used as an intermediate layer. First the three channel components r, g, b of a pixel of the image to be extracted are gamma-corrected:
R = gamma(r), G = gamma(g), B = gamma(b)
where R, G, B are the three channel components after correction by the function gamma(t), which for sRGB values normalized to [0, 1] is gamma(t) = ((t + 0.055) / 1.055)^2.4 for t > 0.04045 and gamma(t) = t / 12.92 otherwise.
The three channel components X, Y, Z of the XYZ color space are then obtained from the conversion formula from the RGB color space to the XYZ color space:
[X, Y, Z]ᵀ = M · [R, G, B]ᵀ
where M is a 3 × 3 matrix; for sRGB with the D65 white point it is the standard matrix
M = [0.4124 0.3576 0.1805; 0.2126 0.7152 0.0722; 0.0193 0.1192 0.9505].
In the CIE-Lab color space, the values of the three channels L, a, b are obtained from the conversion formula from the XYZ color space to the CIE-Lab color space:
L = 116 f(Y/Yn) − 16
a = 500 (f(X/Xn) − f(Y/Yn))
b = 200 (f(Y/Yn) − f(Z/Zn))
where X, Y, Z are the three channel components after the RGB-to-XYZ transformation; the values of Xn, Yn, Zn are 95.047, 100.0 and 108.883 respectively; and f(X/Xn), f(Y/Yn), f(Z/Zn) are computed by the function
f(t) = t^(1/3) for t > (6/29)³, and f(t) = t / (3 (6/29)²) + 4/29 otherwise.
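In practice the whole RGB → XYZ → CIE-Lab chain is available as a single library call; a sketch with scikit-image, which applies the same sRGB gamma correction, 3 × 3 matrix and f(t) nonlinearity as above (the random array is a stand-in for the image to be extracted):

```python
import numpy as np
from skimage.color import rgb2lab

rgb = np.random.rand(64, 64, 3)   # stand-in for the image to be extracted
lab = rgb2lab(rgb)                # gamma correction, RGB->XYZ, XYZ->Lab
L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
```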
(5b) Initialize the cluster centres of the superpixels: set the number of superpixels m = 200 and, according to the number of superpixels, distribute the superpixel cluster centres uniformly over the CIE-Lab image, obtaining the cluster-centre set C_d = {c_1, c_2, ..., c_i, ..., c_m}, where c_i = [l_i, a_i, b_i, x_i, y_i] is the i-th cluster centre after the d-th iteration, m centres in total; l_i, a_i, b_i are the three channels of the CIE-Lab color space and (x_i, y_i) are the coordinates of the cluster centre;
(5c) For each pixel pixel of the CIE-Lab image, set the label l(pixel) = −1 and the distance d(pixel) = ∞;
(5d) Compute the gradient values of all pixels in the 3 × 3 neighborhood of each cluster centre in the set C_d, and move each cluster centre to the pixel with the smallest gradient in that neighborhood, obtaining a new cluster-centre set C_{d+1};
(5e) For each cluster centre c_i in the set C_d and every pixel pixel = [l_p, a_p, b_p, x_p, y_p] in its 2S × 2S neighborhood, compute the distance D(pixel) between c_i and the pixel:
D(pixel) = √(d_c² + (d_s / S)² M²)
where d_c = √((l_p − l_i)² + (a_p − a_i)² + (b_p − b_i)²) is the color difference between the pixels, d_s = √((x_p − x_i)² + (y_p − y_i)²) is the spatial distance between the pixels, M is the maximum of d_c, S = √(N/m), N is the number of pixels of the whole image and m is the set number of superpixels;
(5f) Compare d(pixel) with D(pixel): if D(pixel) < d(pixel), assign D(pixel) to d(pixel), i.e. record with d(pixel) the distance from the pixel to the cluster centre c_i, and mark with the pixel label l(pixel) = i that the pixel belongs to the i-th superpixel, obtaining a new superpixel b_i;
(5g) Repeat steps (5d)–(5f), updating the cluster centres until the residual error converges, obtaining the superpixel image B = {b_1, b_2, ..., b_i, ..., b_m}.
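Steps (5b)-(5g) are the standard SLIC iteration; a sketch showing the joint distance of (5e) and the equivalent scikit-image call, where compactness plays the role of the weight M (its value is illustrative) and n_segments matches the m = 200 set above:

```python
import numpy as np
from skimage.segmentation import slic

def slic_distance(center, pixel, S, M):
    """Joint SLIC distance D = sqrt(d_c^2 + (d_s / S)^2 * M^2) between a
    cluster centre and a pixel, both given as [l, a, b, x, y]."""
    d_c = np.linalg.norm(np.asarray(center[:3]) - np.asarray(pixel[:3]))
    d_s = np.linalg.norm(np.asarray(center[3:]) - np.asarray(pixel[3:]))
    return np.sqrt(d_c ** 2 + (d_s / S) ** 2 * M ** 2)

# In practice the whole of (5a)-(5g) is one call:
# labels[y, x] = i marks pixel (x, y) as belonging to superpixel b_i.
rgb = np.random.rand(64, 64, 3)   # stand-in for the image to be extracted
labels = slic(rgb, n_segments=200, compactness=10, start_label=0)
```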
Step 6: fuse the image at the superpixel level with the extraction result S1(x, y) at the pixel level of the image to be extracted, obtaining the foreground S2(x_i, y_i) of the image to be extracted.
(6a) For each superpixel b_i of the image B at the superpixel level, weight the labels l_ij that all pixels contained in b_i have in the pixel-level extraction result S1(x, y), obtaining the label confidence Score_bi of b_i:
Score_bi = Σ l_ij
(6b) Set the confidence threshold gate and compare it with the label confidence Score_bi of superpixel b_i, obtaining the label l_bi of b_i at the superpixel level, and take l_bi as the label S2(x_i, y_i) of pixel (x_i, y_i); the labels of all pixels within a superpixel are identical to the label of the superpixel, and S2(x_i, y_i) is the foreground of the image to be extracted, where (x_i, y_i) ∈ b_i.
The smaller the confidence threshold gate is set, the smaller the probability that superpixel b_i is judged as foreground; the larger gate is, the larger the probability that b_i is judged as foreground, but when gate is too large, excessive noise remains in the foreground extraction result.
The comparison of the confidence threshold gate with the label confidence Score_bi that yields the label l_bi of superpixel b_i is:
l_bi = 1 if Score_bi > num_bi / gate
l_bi = 0 if Score_bi < num_bi / gate
where l_bi is the label of superpixel b_i, num_bi is the number of pixels in superpixel b_i, gate is the confidence threshold, 1 is the foreground label and 0 is the background label.
The multi-view fusion refers to fusing the image B at the superpixel level with the extraction result S1(x, y) at the pixel level.
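A sketch of the fusion rule of (6a)-(6b): Score_bi sums the pixel-level labels inside each superpixel, and the superpixel becomes foreground when Score_bi > num_bi / gate. The value gate = 2, a simple majority vote, is our illustrative choice; the patent leaves the threshold to be set:

```python
import numpy as np

def fuse(s1, labels, gate=2):
    """s1: pixel-level GrabCut labels (0/1); labels: SLIC superpixel map.
    Returns S2, the fused foreground mask of the image to be extracted."""
    s2 = np.zeros_like(s1)
    for i in np.unique(labels):
        region = labels == i
        score = s1[region].sum()          # Score_bi
        if score > region.sum() / gate:   # Score_bi > num_bi / gate
            s2[region] = 1
    return s2
```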
The technical effect of the present invention is further illustrated below through foreground extraction experiments.
1. Experimental conditions and content
The experiments extract pedestrian and leaf targets respectively. The training data are pedestrian and leaf image sets collected at random from the internet, containing 736 and 186 images respectively; positive and negative samples are taken from each image and then labeled, forming the pedestrian-class sample image set and the leaf-class sample image set.
Foreground extraction of the pedestrian and leaf targets was realized by programming in MATLAB R2017a; the results are shown in Fig. 5.
2. Analysis of experimental results
As can be seen from Fig. 5, for the 4 images tested on each of the two classes of data, no noise is present in the output foreground extraction results and the extracted foreground edges are good; for example, in the foreground extraction results of the 4 leaf-class images the extracted foreground edges are extremely accurate. The method is also tolerant of the completeness of the foreground contained in the input image: for input image 3 of the pedestrian class, a good foreground extraction result is still obtained with a half-length portrait as input.
In addition, as can be seen from Fig. 5, once the SVM classifier has completed training, the present invention can complete the foreground extraction of the image to be extracted automatically, which removes the need for interactive assistance in existing graph-cut-based foreground extraction methods. Meanwhile, the present invention makes full use of the good consistency within a superpixel block to repair the edges of the pixel-level extraction result output by the GrabCut algorithm, making the foreground extraction result more accurate and smooth and improving the foreground extraction precision.

Claims (8)

1. An image foreground extraction method based on multi-view fusion, characterized by comprising:
(1) training an SVM classifier, obtaining a trained SVM classifier;
(2) converting an image to be extracted to grayscale, obtaining a grayscale image;
(3) detecting, by the trained SVM classifier, the sub-image p_k containing the foreground target in the grayscale image:
(3a) sliding multi-scale windows line by line over the grayscale image at a set interval, obtaining an image set P = {p_1, p_2, ..., p_k, ..., p_q} composed of multiple sub-images, where k ∈ [1, q], p_k is the k-th sub-image and q is the number of sub-images;
(3b) extracting the histogram-of-oriented-gradients HOG features of each sub-image p_k in the image set P, inputting them into the trained SVM classifier for classification, and computing the label l_pk of the sub-image p_k;
(3c) judging whether the label l_pk of the sub-image p_k is positive: if so, the sub-image p_k contains the foreground target; recording the position of p_k in the image to be extracted, i.e. the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, and performing step (4); otherwise, discarding the sub-image p_k;
(4) performing foreground extraction on the image to be extracted: using the position (x_min, y_min) of the top-left pixel of the sub-image p_k and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted to replace the human interaction of the GrabCut algorithm, and performing foreground extraction on the image to be extracted with this replacement, obtaining the extraction result S1(x, y) at the pixel level of the image to be extracted;
(5) computing the superpixels of the image to be extracted with the simple linear iterative clustering algorithm SLIC, obtaining the image at the superpixel level: B = {b_1, b_2, ..., b_i, ..., b_m}, i ∈ [1, m], where b_i is the i-th superpixel and m is the number of superpixels;
(6) performing multi-view fusion on the image B at the superpixel level and the extraction result S1(x, y) at the pixel level of the image to be extracted, obtaining the foreground S2(x_i, y_i) of the image to be extracted.
2. The method according to claim 1, wherein training the SVM classifier in step (1) is carried out as follows:
(1a) collecting a sample image set containing the foreground class, and converting all sample images in it to grayscale, obtaining a sample grayscale image set;
(1b) extracting the histogram-of-oriented-gradients HOG features of each image in the sample grayscale image set, obtaining a sample HOG feature set;
(1c) training the SVM classifier with all the HOG features in the sample HOG feature set, obtaining the trained SVM classifier.
3. The method according to claim 2, characterized in that converting all sample images in the sample image set to grayscale in step (1a) is taking the weighted average of the three channels of each sample image, the red component R, the green component G and the blue component B, to obtain the gray value Gray of the sample grayscale image:
Gray = R × 0.299 + G × 0.587 + B × 0.114.
4. The method according to claim 2, characterized in that the extraction of the HOG features of each image in the sample grayscale image set in step (1b) is carried out in accordance with the following steps:
(1b1) dividing the input image into several adjacent, non-overlapping units, and computing the gradient magnitude G(x, y) and gradient direction α(x, y) of the pixels in each unit:
G(x, y) = √(Gx(x, y)² + Gy(x, y)²)
α(x, y) = tan⁻¹(Gx(x, y) / Gy(x, y))
where Gx(x, y) = H(x+1, y) − H(x−1, y) and Gy(x, y) = H(x, y+1) − H(x, y−1) respectively represent the horizontal and vertical gradients at pixel (x, y) of the input image, H(x+1, y) represents the pixel value at pixel (x+1, y) of the input image, H(x−1, y) the pixel value at pixel (x−1, y), H(x, y+1) the pixel value at pixel (x, y+1), and H(x, y−1) the pixel value at pixel (x, y−1);
(1b2) dividing all gradient directions α(x, y) into 9 angles, which form the horizontal axis of a histogram, the gradient values accumulated within each angular range forming the vertical axis, to obtain the gradient histogram;
(1b3) counting the gradient histogram of each unit to obtain the feature descriptor of each unit;
(1b4) grouping n × n units into a block, and concatenating the feature descriptors of all units in a block to obtain the HOG feature descriptor of that block;
(1b5) concatenating the HOG feature descriptors of all blocks in the input image to obtain the HOG features of the input image;
(1b6) concatenating the HOG features of all input images in the sample grayscale image set to obtain the HOG feature set of the sample grayscale image set.
5. The method according to claim 1, characterized in that the label l_pk of sub-image p_k in step (3b) is computed by the following formula:
l_pk = sgn(ω · x_k + φ)
where ω is the normal vector of the hyperplane of the SVM classifier, x_k is the histogram-of-oriented-gradients HOG feature of the sub-image p_k, k ∈ [1, q], q is the number of sub-images, and φ is the offset of the hyperplane of the SVM classifier.
6. The method according to claim 1, characterized in that computing the superpixels of the image to be extracted in step (5) is realized by the following steps:
(5a) transforming the image to be extracted from the RGB color space to the CIE-Lab color space, obtaining the CIE-Lab image;
(5b) initializing the cluster centres of the superpixels: setting the number of superpixels and, according to the number of superpixels, distributing the superpixel cluster centres uniformly over the CIE-Lab image, obtaining the cluster-centre set C_d = {c_1, c_2, ..., c_i, ..., c_m}, where c_i = [l_i, a_i, b_i, x_i, y_i] is the i-th cluster centre after the d-th iteration, m centres in total; l_i, a_i, b_i are the three channels of the CIE-Lab color space and (x_i, y_i) are the coordinates of the cluster centre;
(5c) for each pixel pixel of the CIE-Lab image, setting the label l(pixel) = −1 and the distance d(pixel) = ∞;
(5d) computing the gradient values of all pixels in the n × n neighborhood of each cluster centre in the set C_d, and moving each cluster centre to the pixel with the smallest gradient in that neighborhood, obtaining a new cluster-centre set C_{d+1};
(5e) for each cluster centre c_i in the set C_d and every pixel pixel = [l_p, a_p, b_p, x_p, y_p] in its 2S × 2S neighborhood, computing the distance D(pixel) between c_i and the pixel:
D(pixel) = √(d_c² + (d_s / S)² M²)
where d_c is the color difference between the pixels, d_s is the spatial distance between the pixels, M is the maximum of d_c, S = √(N/m), N is the number of pixels of the whole image and m is the set number of superpixels;
(5f) comparing d(pixel) with D(pixel): if D(pixel) < d(pixel), assigning D(pixel) to d(pixel) and setting l(pixel) = i, obtaining a new superpixel b_i;
(5g) performing steps (5d)–(5f) repeatedly, updating the cluster centres until the residual error converges, obtaining the superpixel image B = {b_1, b_2, ..., b_i, ..., b_m}.
7. The method according to claim 1, wherein the multi-view fusion of the image B at the superpixel level and the extraction result S1(x, y) at the pixel level of the image to be extracted in step (6) is carried out as follows:
(6a) weighting the labels l_ij that all pixels contained in each superpixel b_i of the image B at the superpixel level have in the pixel-level extraction result S1(x, y), obtaining the label confidence Score_bi of superpixel b_i;
(6b) setting the confidence threshold gate and comparing it with the label confidence Score_bi of superpixel b_i, obtaining the label l_bi of b_i at the superpixel level:
l_bi = 1 if Score_bi > num_bi / gate
l_bi = 0 if Score_bi < num_bi / gate
where num_bi is the number of pixels in superpixel b_i, 1 is the foreground label and 0 is the background label;
(6c) taking the label l_bi as the label S2(x_i, y_i) of pixel (x_i, y_i), the S2(x_i, y_i) being the foreground of the image to be extracted, where (x_i, y_i) ∈ b_i.
8. The method according to claim 7, characterized in that the label confidence Score_bi of superpixel b_i in step (6a) is obtained by summing the labels that all pixels contained in superpixel b_i of the image B at the superpixel level have in the pixel-level extraction result S1(x, y), i.e.:
Score_bi = Σ l_ij
where Score_bi is the label confidence of superpixel b_i.
CN201711216652.9A 2017-11-28 2017-11-28 Image foreground extraction method based on multi-view fusion Pending CN108090485A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711216652.9A | 2017-11-28 | 2017-11-28 | Image foreground extraction method based on multi-view fusion

Publications (1)

Publication Number | Publication Date
CN108090485A | 2018-05-29

Family

ID=62172999

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201711216652.9A (Pending) | Image foreground extraction method based on multi-view fusion | 2017-11-28 | 2017-11-28

Country Status (1)

Country Link
CN (1) CN108090485A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663411A (en) * 2012-02-29 2012-09-12 宁波大学 Recognition method for target human body
CN107527054A (en) * 2017-09-19 2017-12-29 西安电子科技大学 Prospect extraction method based on various visual angles fusion

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846397A (en) * 2018-05-31 2018-11-20 浙江科技学院 A kind of cable semi-conductive layer automatic testing method based on image procossing
CN109242968A (en) * 2018-08-24 2019-01-18 电子科技大学 A kind of river three-dimensional modeling method cut based on the super voxel figure of more attributes
CN109784374A (en) * 2018-12-21 2019-05-21 西北工业大学 Multi-angle of view clustering method based on adaptive neighbor point
CN112766387A (en) * 2021-01-25 2021-05-07 海尔数字科技(上海)有限公司 Error correction method, device, equipment and storage medium for training data
CN112766387B (en) * 2021-01-25 2024-01-23 卡奥斯数字科技(上海)有限公司 Training data error correction method, device, equipment and storage medium
CN113393455A (en) * 2021-07-05 2021-09-14 武汉智目智能技术合伙企业(有限合伙) Machine vision technology-based foreign fiber detection method
CN114677573A (en) * 2022-05-30 2022-06-28 上海捷勃特机器人有限公司 Visual classification method, system, device and computer readable medium
CN114677573B (en) * 2022-05-30 2022-08-26 上海捷勃特机器人有限公司 Visual classification method, system, device and computer readable medium
CN115953780A (en) * 2023-03-10 2023-04-11 清华大学 Multi-dimensional light field complex scene graph construction method based on multi-view information fusion


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180529