CN108090485A - Image foreground extraction method based on multi-view fusion - Google Patents

Image foreground extraction method based on multi-view fusion

Info

Publication number
CN108090485A
Authority
CN
China
Prior art keywords
pixel
image
super
extracted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711216652.9A
Other languages
Chinese (zh)
Inventor
王敏
马宏斌
侯本栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Kunshan Innovation Institute of Xidian University
Original Assignee
Xidian University
Kunshan Innovation Institute of Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University and Kunshan Innovation Institute of Xidian University
Priority to CN201711216652.9A
Publication of CN108090485A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image foreground extraction method based on multi-view fusion, which mainly solves the problems of the laborious extraction process and inaccurate foreground edges of existing graph-cut-based extraction techniques. It is implemented as follows: first train an SVM classifier, then obtain the grayscale image of the image to be extracted; detect the sub-image containing the foreground in the grayscale image with the trained SVM classifier; take the position coordinates of the sub-image in the image to be extracted as the input of the GrabCut algorithm and perform foreground extraction on the image to be extracted, obtaining the extraction result at the pixel level of the image to be extracted; generate the image at the superpixel level from the image to be extracted with the SLIC algorithm; and fuse the image at the superpixel level with the extraction result at the pixel level, obtaining the foreground extraction result of the image to be extracted. The invention simplifies the foreground extraction process, improves the efficiency and precision of extraction, and can be used for stereo vision, image semantic recognition, three-dimensional reconstruction and image search.

Description

Image foreground extraction method based on multi-view fusion
Technical field
The invention belongs to the technical field of image processing, and further relates to an automatic image foreground extraction method based on multi-view fusion. The invention can be used in applications and research such as stereo vision, image semantic recognition and image search.
Background technology
Foreground extraction is a means of extracting a target of interest from an image. It is the technique and process of dividing an image into several specific regions with unique properties and extracting the target of interest from them, and it has become a key step from image processing to image analysis. More concretely, it divides an image, according to features such as gray level, color, texture and shape, into several complementary, non-overlapping regions, such that these features show similarity within the same region and obvious differences between different regions. After decades of development, foreground extraction has gradually formed a scientific system of its own, new extraction methods emerge endlessly, and it has become an interdisciplinary field attracting wide attention from researchers and practitioners in many areas, such as the medical field, airborne and spaceborne remote sensing, industrial inspection, security and the military field.
Current foreground extraction methods mainly include threshold-based methods, edge-based methods, region-based methods, graph-cut-based methods, energy-functional-based methods and deep-learning-based image foreground extraction methods. Among them, graph-cut-based foreground extraction is favored for its high accuracy and easy operation. It is a combinatorial optimization method based on graph theory: according to the user's interactive information, it maps an image to a network, establishes an energy function over the labels, performs a limited number of iterative cuts on the network with the max-flow/min-cut algorithm, and takes the resulting minimum cut of the network as the foreground extraction result of the image. However, because of the human interaction involved, the manual workload becomes too large when extracting from many images, which limits its application in engineering. For example, Meng Tang et al. published "GrabCut in One Cut" at the 2013 IEEE International Conference on Computer Vision: the user selects the foreground region, the region containing the foreground is mapped to a graph, and One Cut performs a limited number of iterative cuts on the mapped graph to obtain the foreground extraction result of the image. But it needs human interaction to calibrate the region where the foreground lies, which makes the foreground extraction process rather laborious, and the limited-iteration energy optimization can only obtain a near-optimal minimum cut, so it is difficult to obtain accurate foreground edges.
Summary of the invention
It is an object of the invention, in view of the deficiencies of the above prior art, to propose an image foreground extraction method based on multi-view fusion, which solves the problems in existing graph-cut-based foreground extraction methods of a laborious extraction process caused by human interaction and of inaccurate foreground edges caused by the limited-iteration energy optimization.
To achieve the above object, the technical solution adopted by the present invention comprises the following steps:
(1) Train an SVM classifier, obtaining a trained SVM classifier;
(2) Convert the image to be extracted to grayscale, obtaining a grayscale image;
(3) Detect, by the trained SVM classifier, the sub-image p_k containing the foreground target in the grayscale image:
(3a) Slide multi-scale windows line by line over the grayscale image at a set interval, obtaining an image set P = {p_1, p_2, ..., p_k, ..., p_q} composed of multiple sub-images, where k ∈ [1, q], p_k is the k-th sub-image and q is the number of sub-images;
(3b) Extract the histogram-of-oriented-gradients HOG features of each sub-image p_k in the image set P, input them into the trained SVM classifier for classification, and compute the label l_pk of p_k;
(3c) Judge whether the label l_pk of p_k is positive: if so, p_k contains the foreground target; record the position of p_k in the image to be extracted, i.e. the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, and perform step (4); otherwise, discard p_k;
(4) Perform foreground extraction on the image to be extracted:
Use the position (x_min, y_min) of the top-left pixel of p_k and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted to replace the human interaction of the GrabCut algorithm, and perform foreground extraction on the image to be extracted with this replacement, obtaining the extraction result S1(x, y) at the pixel level of the image to be extracted;
(5) Compute the superpixels of the image to be extracted with the simple linear iterative clustering algorithm SLIC, obtaining the image at the superpixel level: B = {b_1, b_2, ..., b_i, ..., b_m}, i ∈ [1, m], where b_i is the i-th superpixel and m is the number of superpixels;
(6) Fuse the image B at the superpixel level with the extraction result S1(x, y) at the pixel level of the image to be extracted, obtaining the foreground S2(x_i, y_i) of the image to be extracted.
Compared with the prior art, the present invention has the following advantages:
1) The present invention uses the trained SVM classifier to obtain the sub-image where the foreground lies in the image to be extracted, and uses the position coordinates of this sub-image in the image to be extracted to replace the rectangular region that the human interaction of the GrabCut algorithm would otherwise provide as the algorithm input, realizing foreground extraction on the image to be extracted. By fully combining the SVM classifier with the GrabCut algorithm, the image foreground extraction process can be completed automatically, which solves the problem of the laborious extraction process caused by human interaction in existing graph-cut-based foreground extraction methods and effectively improves the efficiency of image foreground extraction.
2) The present invention performs superpixel extraction on the image to be extracted with the SLIC algorithm, making full use of the good consistency within a superpixel block; by fusing the image at the superpixel level with the extraction result at the pixel level, an accurate foreground extraction result of the image to be extracted can be obtained.
3) By introducing superpixels, the present invention makes the foreground extraction result more accurate and smooth, which solves the problem of inaccurate foreground edges caused by the limited-iteration energy optimization in existing graph-cut-based foreground extraction methods and improves the precision of image foreground extraction.
Description of the drawings
Fig. 1 is the implementation flowchart of the present invention;
Fig. 2 is the structure of the sample image set in the present invention;
Fig. 3 is the schematic diagram of HOG feature extraction in the present invention;
Fig. 4 is the visualization of HOG features in the present invention;
Fig. 5 shows experimental results of using the present invention to extract pedestrians and leaves as the foreground.
Specific embodiments
The present invention is described in further detail below in conjunction with the drawings and specific embodiments.
With reference to Fig. 1, the image foreground extraction method based on multi-view fusion comprises the following steps:
Step 1: train the SVM classifier.
(1a) Collect a sample image set containing the foreground class, and convert all sample images in it to grayscale, obtaining a sample grayscale image set.
The structure of the sample image set containing the foreground class is shown in Fig. 2. It comprises positive samples, negative samples and a sample label file: a positive sample is an image containing the foreground, a negative sample is an image not containing the foreground, and the sample label file describes the class and storage location of the positive and negative samples.
Converting all sample images in the sample image set to grayscale means taking the weighted average of the three channels of each sample image, the red component R, the green component G and the blue component B, to obtain the gray value Gray of the sample grayscale image:
Gray = R × 0.299 + G × 0.587 + B × 0.114
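For illustration, this weighted average maps directly onto a few lines of NumPy; the following is a minimal sketch (the function name and the 8-bit RGB input assumption are ours, not part of the patent):

```python
import numpy as np

def to_gray(rgb):
    """Weighted-average grayscale: Gray = R*0.299 + G*0.587 + B*0.114."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    return (0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)
```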
(1b) Extract the histogram-of-oriented-gradients HOG features of each image in the sample grayscale image set.
With reference to Fig. 3, this step is implemented as follows:
(1b1) Divide the input image into several adjacent, non-overlapping units, and compute the gradient magnitude G(x, y) and gradient direction α(x, y) of the pixels in each unit:
G(x, y) = √(Gx(x, y)² + Gy(x, y)²)
α(x, y) = tan⁻¹(Gx(x, y) / Gy(x, y))
where Gx(x, y) = H(x+1, y) − H(x−1, y) is the horizontal gradient at pixel (x, y) of the input image, Gy(x, y) = H(x, y+1) − H(x, y−1) is the vertical gradient at pixel (x, y) of the input image, and H(x, y) is the pixel value at pixel (x, y) of the input image;
(1b2) Divide all gradient directions α(x, y) into 9 angles, which form the horizontal axis of a histogram; the gradient values accumulated within each angular range form the vertical axis, yielding the gradient histogram;
(1b3) Count the gradient histogram of each unit, obtaining the feature descriptor of each unit;
(1b4) Group 8 × 8 units into a block, and concatenate the feature descriptors of all units in a block, obtaining the HOG feature descriptor of that block;
(1b5) Concatenate the HOG feature descriptors of all blocks in the input image, obtaining the HOG features of the input image. The visualization of the HOG features is shown in Fig. 4, where Fig. 4(a) is the example image and Fig. 4(b) the HOG feature map of the example image; as can be seen from Fig. 4, HOG features describe the appearance and shape of a local target well through the density of gradients or edge directions;
(1b6) Concatenate the HOG features of all input images in the sample grayscale image set, obtaining the HOG feature set of the sample grayscale image set.
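Steps (1b1)-(1b6) are what standard HOG implementations compute; as a sketch, scikit-image's hog covers the cell histograms, block grouping and concatenation in one call (the cell and block sizes below are common defaults chosen for illustration, not the patent's mandated values):

```python
from skimage.feature import hog

def hog_features(gray):
    """HOG descriptor: 9 orientation bins per cell, cells grouped into
    blocks, all block descriptors concatenated, as in (1b1)-(1b6)."""
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')
```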
(1c) Train the SVM classifier with all the HOG features in the sample HOG feature set, obtaining the trained SVM classifier.
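A sketch of step (1c) with scikit-learn, reusing hog_features from the sketch above; it assumes the samples are lists of grayscale images already resized to the detection-window size, so the feature vectors have equal length:

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_svm(pos_grays, neg_grays):
    """Fit a linear SVM on the HOG features of positive (foreground)
    and negative (background) grayscale sample images."""
    X = np.array([hog_features(g) for g in list(pos_grays) + list(neg_grays)])
    y = np.array([1] * len(pos_grays) + [0] * len(neg_grays))
    return LinearSVC().fit(X, y)
```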
Step 2: convert the image to be extracted to grayscale, obtaining its grayscale image.
Take the weighted average of the three channels of the image to be extracted, the red component R′, the green component G′ and the blue component B′, to obtain the gray value Gray′ of each pixel of the image to be extracted:
Gray′ = R′ × 0.299 + G′ × 0.587 + B′ × 0.114
The grayscale image of the image to be extracted is obtained from the gray value of each pixel.
Step 3: use the trained SVM classifier to detect the sub-image p_k containing the foreground target in the grayscale image of the image to be extracted, obtaining the position (x_min, y_min) of the top-left pixel of p_k and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted.
(3a) Slide multi-scale windows line by line over the grayscale image of the image to be extracted at a set interval, obtaining an image set composed of multiple sub-images: P = {p_1, p_2, ..., p_k, ..., p_q}, where p_k is the k-th sub-image and q is the number of sub-images;
(3b) Extract the histogram-of-oriented-gradients HOG features of each sub-image p_k in the image set P and input them into the trained SVM classifier for classification, obtaining the label l_pk of p_k, computed as
l_pk = sgn(ω · x_k + φ)
where ω is the normal vector of the hyperplane of the SVM classifier, x_k is the HOG feature of sub-image p_k, k ∈ [1, q], q is the number of sub-images, and φ is the offset of the hyperplane of the SVM classifier;
(3c) Judge whether the label l_pk of p_k is positive: if so, p_k contains the foreground target; record the position of p_k in the image to be extracted, i.e. the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, the two together forming the rectangular region containing the foreground, and perform step 4; otherwise, discard p_k.
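A sketch of the multi-scale sliding window of (3a)-(3c), using the classifier's signed decision value as the label l_pk; the window size, stride and scales are illustrative choices of ours, not values the patent specifies:

```python
from skimage.transform import rescale

def detect_windows(gray, clf, win=(128, 64), step=16, scales=(1.0, 0.75, 0.5)):
    """Slide a window over several scales of the grayscale image and keep
    the windows the SVM labels as foreground, returned as rectangles in
    the coordinates of the original image."""
    h, w = win
    rects = []
    for s in scales:
        img = rescale(gray, s, anti_aliasing=True)
        for y0 in range(0, img.shape[0] - h + 1, step):
            for x0 in range(0, img.shape[1] - w + 1, step):
                feat = hog_features(img[y0:y0 + h, x0:x0 + w])
                if clf.decision_function([feat])[0] > 0:  # l_pk positive
                    # map the window corners back to the original image
                    rects.append((int(x0 / s), int(y0 / s),
                                  int((x0 + w) / s), int((y0 + h) / s)))
    return rects  # each rect = (x_min, y_min, x_max, y_max)
```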
Step 4: perform foreground extraction on the image to be extracted.
Foreground extraction methods mainly include threshold-based methods, edge-based methods, region-based methods, graph-cut-based methods, energy-functional-based methods and deep-learning-based image foreground extraction methods. This example uses, but is not limited to, the GrabCut algorithm among the graph-cut-based foreground extraction methods. Concretely: the positions (x_min, y_min) and (x_max, y_max) of the top-left and bottom-right pixels of sub-image p_k in the image to be extracted, together with the image to be extracted, are taken as the input of the GrabCut algorithm, and foreground extraction is performed on the image to be extracted, obtaining the extraction result S1(x, y) at the pixel level of the image to be extracted.
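A sketch of step 4 with OpenCV's cv2.grabCut, where the detected rectangle replaces the user-drawn box; the 5-iteration count is an arbitrary choice of ours:

```python
import cv2
import numpy as np

def grabcut_from_rect(img_bgr, rect):
    """rect = (x_min, y_min, x_max, y_max) from the SVM detector; returns
    S1(x, y): 1 for (probable) foreground pixels, 0 for background."""
    x0, y0, x1, y1 = rect
    mask = np.zeros(img_bgr.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)  # background GMM buffer
    fgd = np.zeros((1, 65), np.float64)  # foreground GMM buffer
    cv2.grabCut(img_bgr, mask, (x0, y0, x1 - x0, y1 - y0),
                bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
```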
Step 5: compute the superpixels of the image to be extracted.
Existing methods for computing superpixels include graph-based methods and gradient-descent-based methods; this example uses, but is not limited to, the SLIC algorithm among the gradient-descent-based methods. The concrete steps are as follows:
(5a) Transform the image to be extracted from the RGB color space to the CIE-Lab color space, obtaining the CIE-Lab image.
There is no direct conversion formula between RGB and Lab; the XYZ color space must be used as an intermediate layer. First the three channel components r, g, b of a pixel of the image to be extracted are gamma-corrected:
R = gamma(r), G = gamma(g), B = gamma(b)
where R, G, B are the three channel components after correction by the function gamma(t), which for sRGB values normalized to [0, 1] is gamma(t) = ((t + 0.055) / 1.055)^2.4 for t > 0.04045 and gamma(t) = t / 12.92 otherwise.
The three channel components X, Y, Z of the XYZ color space are then obtained from the conversion formula from the RGB color space to the XYZ color space:
[X, Y, Z]ᵀ = M · [R, G, B]ᵀ
where M is a 3 × 3 matrix; for sRGB with the D65 white point it is the standard matrix
M = [0.4124 0.3576 0.1805; 0.2126 0.7152 0.0722; 0.0193 0.1192 0.9505].
In the CIE-Lab color space, the values of the three channels L, a, b are obtained from the conversion formula from the XYZ color space to the CIE-Lab color space:
L = 116 f(Y/Yn) − 16
a = 500 (f(X/Xn) − f(Y/Yn))
b = 200 (f(Y/Yn) − f(Z/Zn))
where X, Y, Z are the three channel components after the RGB-to-XYZ transformation; the values of Xn, Yn, Zn are 95.047, 100.0 and 108.883 respectively; and f(X/Xn), f(Y/Yn), f(Z/Zn) are computed by the function
f(t) = t^(1/3) for t > (6/29)³, and f(t) = t / (3 (6/29)²) + 4/29 otherwise.
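In practice the whole RGB → XYZ → CIE-Lab chain is available as a single library call; a sketch with scikit-image, which applies the same sRGB gamma correction, 3 × 3 matrix and f(t) nonlinearity as above (the random array is a stand-in for the image to be extracted):

```python
import numpy as np
from skimage.color import rgb2lab

rgb = np.random.rand(64, 64, 3)   # stand-in for the image to be extracted
lab = rgb2lab(rgb)                # gamma correction, RGB->XYZ, XYZ->Lab
L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
```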
(5b) Initialize the cluster centres of the superpixels: set the number of superpixels m = 200 and, according to the number of superpixels, distribute the superpixel cluster centres uniformly over the CIE-Lab image, obtaining the cluster-centre set C_d = {c_1, c_2, ..., c_i, ..., c_m}, where c_i = [l_i, a_i, b_i, x_i, y_i] is the i-th cluster centre after the d-th iteration, m centres in total; l_i, a_i, b_i are the three channels of the CIE-Lab color space and (x_i, y_i) are the coordinates of the cluster centre;
(5c) For each pixel pixel of the CIE-Lab image, set the label l(pixel) = −1 and the distance d(pixel) = ∞;
(5d) Compute the gradient values of all pixels in the 3 × 3 neighborhood of each cluster centre in the set C_d, and move each cluster centre to the pixel with the smallest gradient in that neighborhood, obtaining a new cluster-centre set C_{d+1};
(5e) For each cluster centre c_i in the set C_d and every pixel pixel = [l_p, a_p, b_p, x_p, y_p] in its 2S × 2S neighborhood, compute the distance D(pixel) between c_i and the pixel:
D(pixel) = √(d_c² + (d_s / S)² M²)
where d_c = √((l_p − l_i)² + (a_p − a_i)² + (b_p − b_i)²) is the color difference between the pixels, d_s = √((x_p − x_i)² + (y_p − y_i)²) is the spatial distance between the pixels, M is the maximum of d_c, S = √(N/m), N is the number of pixels of the whole image and m is the set number of superpixels;
(5f) Compare d(pixel) with D(pixel): if D(pixel) < d(pixel), assign D(pixel) to d(pixel), i.e. record with d(pixel) the distance from the pixel to the cluster centre c_i, and mark with the pixel label l(pixel) = i that the pixel belongs to the i-th superpixel, obtaining a new superpixel b_i;
(5g) Repeat steps (5d)–(5f), updating the cluster centres until the residual error converges, obtaining the superpixel image B = {b_1, b_2, ..., b_i, ..., b_m}.
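Steps (5b)-(5g) are the standard SLIC iteration; a sketch showing the joint distance of (5e) and the equivalent scikit-image call, where compactness plays the role of the weight M (its value is illustrative) and n_segments matches the m = 200 set above:

```python
import numpy as np
from skimage.segmentation import slic

def slic_distance(center, pixel, S, M):
    """Joint SLIC distance D = sqrt(d_c^2 + (d_s / S)^2 * M^2) between a
    cluster centre and a pixel, both given as [l, a, b, x, y]."""
    d_c = np.linalg.norm(np.asarray(center[:3]) - np.asarray(pixel[:3]))
    d_s = np.linalg.norm(np.asarray(center[3:]) - np.asarray(pixel[3:]))
    return np.sqrt(d_c ** 2 + (d_s / S) ** 2 * M ** 2)

# In practice the whole of (5a)-(5g) is one call:
# labels[y, x] = i marks pixel (x, y) as belonging to superpixel b_i.
rgb = np.random.rand(64, 64, 3)   # stand-in for the image to be extracted
labels = slic(rgb, n_segments=200, compactness=10, start_label=0)
```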
Step 6: fuse the image at the superpixel level with the extraction result S1(x, y) at the pixel level of the image to be extracted, obtaining the foreground S2(x_i, y_i) of the image to be extracted.
(6a) For each superpixel b_i of the image B at the superpixel level, weight the labels l_ij that all pixels contained in b_i have in the pixel-level extraction result S1(x, y), obtaining the label confidence Score_bi of b_i:
Score_bi = Σ l_ij
(6b) Set the confidence threshold gate and compare it with the label confidence Score_bi of superpixel b_i, obtaining the label l_bi of b_i at the superpixel level, and take l_bi as the label S2(x_i, y_i) of pixel (x_i, y_i); the labels of all pixels within a superpixel are identical to the label of the superpixel, and S2(x_i, y_i) is the foreground of the image to be extracted, where (x_i, y_i) ∈ b_i.
The smaller the confidence threshold gate is set, the smaller the probability that superpixel b_i is judged as foreground; the larger gate is, the larger the probability that b_i is judged as foreground, but when gate is too large, excessive noise remains in the foreground extraction result.
The comparison of the confidence threshold gate with the label confidence Score_bi that yields the label l_bi of superpixel b_i is:
l_bi = 1 if Score_bi > num_bi / gate
l_bi = 0 if Score_bi < num_bi / gate
where l_bi is the label of superpixel b_i, num_bi is the number of pixels in superpixel b_i, gate is the confidence threshold, 1 is the foreground label and 0 is the background label.
The multi-view fusion refers to fusing the image B at the superpixel level with the extraction result S1(x, y) at the pixel level.
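A sketch of the fusion rule of (6a)-(6b): Score_bi sums the pixel-level labels inside each superpixel, and the superpixel becomes foreground when Score_bi > num_bi / gate. The value gate = 2, a simple majority vote, is our illustrative choice; the patent leaves the threshold to be set:

```python
import numpy as np

def fuse(s1, labels, gate=2):
    """s1: pixel-level GrabCut labels (0/1); labels: SLIC superpixel map.
    Returns S2, the fused foreground mask of the image to be extracted."""
    s2 = np.zeros_like(s1)
    for i in np.unique(labels):
        region = labels == i
        score = s1[region].sum()          # Score_bi
        if score > region.sum() / gate:   # Score_bi > num_bi / gate
            s2[region] = 1
    return s2
```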
The technical effect of the present invention is further illustrated below through foreground extraction experiments.
1. Experimental conditions and content
The experiments extract pedestrian and leaf targets respectively. The training data are pedestrian and leaf image sets collected at random from the internet, containing 736 and 186 images respectively; positive and negative samples are taken from each image and then labeled, forming the pedestrian-class sample image set and the leaf-class sample image set.
Foreground extraction of the pedestrian and leaf targets was realized by programming in MATLAB R2017a; the results are shown in Fig. 5.
2. Analysis of experimental results
As can be seen from Fig. 5, for the 4 images tested on each of the two classes of data, no noise is present in the output foreground extraction results and the extracted foreground edges are good; for example, in the foreground extraction results of the 4 leaf-class images the extracted foreground edges are extremely accurate. The method is also tolerant of the completeness of the foreground contained in the input image: for input image 3 of the pedestrian class, a good foreground extraction result is still obtained with a half-length portrait as input.
In addition, as can be seen from Fig. 5, once the SVM classifier has completed training, the present invention can complete the foreground extraction of the image to be extracted automatically, which removes the need for interactive assistance in existing graph-cut-based foreground extraction methods. Meanwhile, the present invention makes full use of the good consistency within a superpixel block to repair the edges of the pixel-level extraction result output by the GrabCut algorithm, making the foreground extraction result more accurate and smooth and improving the foreground extraction precision.

Claims (8)

1. An image foreground extraction method based on multi-view fusion, characterized by comprising:
(1) training an SVM classifier, obtaining a trained SVM classifier;
(2) converting an image to be extracted to grayscale, obtaining a grayscale image;
(3) detecting, by the trained SVM classifier, the sub-image p_k containing the foreground target in the grayscale image:
(3a) sliding multi-scale windows line by line over the grayscale image at a set interval, obtaining an image set P = {p_1, p_2, ..., p_k, ..., p_q} composed of multiple sub-images, where k ∈ [1, q], p_k is the k-th sub-image and q is the number of sub-images;
(3b) extracting the histogram-of-oriented-gradients HOG features of each sub-image p_k in the image set P, inputting them into the trained SVM classifier for classification, and computing the label l_pk of the sub-image p_k;
(3c) judging whether the label l_pk of the sub-image p_k is positive: if so, the sub-image p_k contains the foreground target; recording the position of p_k in the image to be extracted, i.e. the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, and performing step (4); otherwise, discarding the sub-image p_k;
(4) performing foreground extraction on the image to be extracted: using the position (x_min, y_min) of the top-left pixel of the sub-image p_k and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted to replace the human interaction of the GrabCut algorithm, and performing foreground extraction on the image to be extracted with this replacement, obtaining the extraction result S1(x, y) at the pixel level of the image to be extracted;
(5) computing the superpixels of the image to be extracted with the simple linear iterative clustering algorithm SLIC, obtaining the image at the superpixel level: B = {b_1, b_2, ..., b_i, ..., b_m}, i ∈ [1, m], where b_i is the i-th superpixel and m is the number of superpixels;
(6) performing multi-view fusion on the image B at the superpixel level and the extraction result S1(x, y) at the pixel level of the image to be extracted, obtaining the foreground S2(x_i, y_i) of the image to be extracted.
2. The method according to claim 1, wherein training the SVM classifier in step (1) is carried out as follows:
(1a) collecting a sample image set containing the foreground class, and converting all sample images in it to grayscale, obtaining a sample grayscale image set;
(1b) extracting the histogram-of-oriented-gradients HOG features of each image in the sample grayscale image set, obtaining a sample HOG feature set;
(1c) training the SVM classifier with all the HOG features in the sample HOG feature set, obtaining the trained SVM classifier.
3. The method according to claim 2, characterized in that converting all sample images in the sample image set to grayscale in step (1a) is taking the weighted average of the three channels of each sample image, the red component R, the green component G and the blue component B, to obtain the gray value Gray of the sample grayscale image:
Gray = R × 0.299 + G × 0.587 + B × 0.114.
4. The method according to claim 2, characterized in that the extraction of the HOG features of each image in the sample grayscale image set in step (1b) is carried out in accordance with the following steps:
(1b1) dividing the input image into several adjacent, non-overlapping units, and computing the gradient magnitude G(x, y) and gradient direction α(x, y) of the pixels in each unit:
G(x, y) = √(Gx(x, y)² + Gy(x, y)²)
α(x, y) = tan⁻¹(Gx(x, y) / Gy(x, y))
where Gx(x, y) = H(x+1, y) − H(x−1, y) and Gy(x, y) = H(x, y+1) − H(x, y−1) respectively represent the horizontal and vertical gradients at pixel (x, y) of the input image, H(x+1, y) represents the pixel value at pixel (x+1, y) of the input image, H(x−1, y) the pixel value at pixel (x−1, y), H(x, y+1) the pixel value at pixel (x, y+1), and H(x, y−1) the pixel value at pixel (x, y−1);
(1b2) dividing all gradient directions α(x, y) into 9 angles, which form the horizontal axis of a histogram, the gradient values accumulated within each angular range forming the vertical axis, to obtain the gradient histogram;
(1b3) counting the gradient histogram of each unit to obtain the feature descriptor of each unit;
(1b4) grouping n × n units into a block, and concatenating the feature descriptors of all units in a block to obtain the HOG feature descriptor of that block;
(1b5) concatenating the HOG feature descriptors of all blocks in the input image to obtain the HOG features of the input image;
(1b6) concatenating the HOG features of all input images in the sample grayscale image set to obtain the HOG feature set of the sample grayscale image set.
5. The method according to claim 1, characterized in that the label l_pk of sub-image p_k in step (3b) is computed by the following formula:
l_pk = sgn(ω · x_k + φ)
where ω is the normal vector of the hyperplane of the SVM classifier, x_k is the histogram-of-oriented-gradients HOG feature of the sub-image p_k, k ∈ [1, q], q is the number of sub-images, and φ is the offset of the hyperplane of the SVM classifier.
6. The method according to claim 1, characterized in that computing the superpixels of the image to be extracted in step (5) is realized by the following steps:
(5a) transforming the image to be extracted from the RGB color space to the CIE-Lab color space, obtaining the CIE-Lab image;
(5b) initializing the cluster centres of the superpixels: setting the number of superpixels and, according to the number of superpixels, distributing the superpixel cluster centres uniformly over the CIE-Lab image, obtaining the cluster-centre set C_d = {c_1, c_2, ..., c_i, ..., c_m}, where c_i = [l_i, a_i, b_i, x_i, y_i] is the i-th cluster centre after the d-th iteration, m centres in total; l_i, a_i, b_i are the three channels of the CIE-Lab color space and (x_i, y_i) are the coordinates of the cluster centre;
(5c) for each pixel pixel of the CIE-Lab image, setting the label l(pixel) = −1 and the distance d(pixel) = ∞;
(5d) computing the gradient values of all pixels in the n × n neighborhood of each cluster centre in the set C_d, and moving each cluster centre to the pixel with the smallest gradient in that neighborhood, obtaining a new cluster-centre set C_{d+1};
(5e) for each cluster centre c_i in the set C_d and every pixel pixel = [l_p, a_p, b_p, x_p, y_p] in its 2S × 2S neighborhood, computing the distance D(pixel) between c_i and the pixel:
D(pixel) = √(d_c² + (d_s / S)² M²)
where d_c is the color difference between the pixels, d_s is the spatial distance between the pixels, M is the maximum of d_c, S = √(N/m), N is the number of pixels of the whole image and m is the set number of superpixels;
(5f) comparing d(pixel) with D(pixel): if D(pixel) < d(pixel), assigning D(pixel) to d(pixel) and setting l(pixel) = i, obtaining a new superpixel b_i;
(5g) performing steps (5d)–(5f) repeatedly, updating the cluster centres until the residual error converges, obtaining the superpixel image B = {b_1, b_2, ..., b_i, ..., b_m}.
7. The method according to claim 1, wherein the multi-view fusion of the image B at the superpixel level and the extraction result S1(x, y) at the pixel level of the image to be extracted in step (6) is carried out as follows:
(6a) weighting the labels l_ij that all pixels contained in each superpixel b_i of the image B at the superpixel level have in the pixel-level extraction result S1(x, y), obtaining the label confidence Score_bi of superpixel b_i;
(6b) setting the confidence threshold gate and comparing it with the label confidence Score_bi of superpixel b_i, obtaining the label l_bi of b_i at the superpixel level:
l_bi = 1 if Score_bi > num_bi / gate
l_bi = 0 if Score_bi < num_bi / gate
where num_bi is the number of pixels in superpixel b_i, 1 is the foreground label and 0 is the background label;
(6c) taking the label l_bi as the label S2(x_i, y_i) of pixel (x_i, y_i), the S2(x_i, y_i) being the foreground of the image to be extracted, where (x_i, y_i) ∈ b_i.
8. The method according to claim 7, characterized in that the label confidence Score_bi of superpixel b_i in step (6a) is obtained by summing the labels that all pixels contained in superpixel b_i of the image B at the superpixel level have in the pixel-level extraction result S1(x, y), i.e.:
Score_bi = Σ l_ij
where Score_bi is the label confidence of superpixel b_i.
CN201711216652.9A 2017-11-28 2017-11-28 Image foreground extraction method based on multi-view fusion Pending CN108090485A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711216652.9A | 2017-11-28 | 2017-11-28 | Image foreground extraction method based on multi-view fusion

Publications (1)

Publication Number | Publication Date
CN108090485A | 2018-05-29

Family

ID=62172999

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201711216652.9A (Pending) | Image foreground extraction method based on multi-view fusion | 2017-11-28 | 2017-11-28

Country Status (1)

Country Link
CN (1) CN108090485A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663411A (en) * 2012-02-29 2012-09-12 宁波大学 Recognition method for target human body
CN107527054A (en) * 2017-09-19 2017-12-29 西安电子科技大学 Prospect extraction method based on various visual angles fusion

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846397A (en) * 2018-05-31 2018-11-20 浙江科技学院 A kind of cable semi-conductive layer automatic testing method based on image procossing
CN109242968A (en) * 2018-08-24 2019-01-18 电子科技大学 A kind of river three-dimensional modeling method cut based on the super voxel figure of more attributes
CN109784374A (en) * 2018-12-21 2019-05-21 西北工业大学 Multi-angle of view clustering method based on adaptive neighbor point
CN112766387A (en) * 2021-01-25 2021-05-07 海尔数字科技(上海)有限公司 Error correction method, device, equipment and storage medium for training data
CN112766387B (en) * 2021-01-25 2024-01-23 卡奥斯数字科技(上海)有限公司 Training data error correction method, device, equipment and storage medium
CN113393455A (en) * 2021-07-05 2021-09-14 武汉智目智能技术合伙企业(有限合伙) Machine vision technology-based foreign fiber detection method
CN114677573A (en) * 2022-05-30 2022-06-28 上海捷勃特机器人有限公司 Visual classification method, system, device and computer readable medium
CN114677573B (en) * 2022-05-30 2022-08-26 上海捷勃特机器人有限公司 Visual classification method, system, device and computer readable medium
CN115953780A (en) * 2023-03-10 2023-04-11 清华大学 Multi-dimensional light field complex scene graph construction method based on multi-view information fusion


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180529