CN108090485A - Automatic image foreground extraction method based on multi-view fusion - Google Patents
Automatic image foreground extraction method based on multi-view fusion
- Publication number
- CN108090485A CN201711216652.9A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
Abstract
The invention discloses an image foreground extraction method based on multi-view fusion, which mainly addresses two problems of existing techniques: a cumbersome extraction process and inaccurately extracted foreground edges. The method is implemented as follows: an SVM classifier is first trained, and the grayscale image of the image to be extracted is obtained; the trained SVM classifier detects, in the grayscale image, the sub-image containing the foreground; the position coordinates of the sub-image in the image to be extracted are used as the input of the GrabCut algorithm, and foreground extraction is performed on the image to be extracted, yielding the extraction result under the pixel view; the SLIC algorithm generates the image under the superpixel view from the image to be extracted; the image under the superpixel view and the extraction result under the pixel view are fused to obtain the foreground extraction result of the image to be extracted. The invention simplifies the foreground extraction process, improves the efficiency and precision of extraction, and can be used for stereo vision, image semantic recognition, three-dimensional reconstruction and image search.
Description
Technical field
The invention belongs to the technical field of image processing and further relates to an automatic image foreground extraction method based on multi-view fusion. The invention can be used in applications and research such as stereo vision, image semantic recognition and image search.
Background
Foreground extraction is a means of extracting objects of interest from an image. It is the technique and process of dividing an image into several specific regions with unique properties and extracting the objects of interest, and it has become a key step from image processing to image analysis. Specifically, the image is divided into several mutually non-overlapping regions according to features such as gray level, color, texture and shape, so that these features are similar within a region and clearly different between regions. After decades of development, foreground extraction has gradually formed its own scientific system, and new extraction methods keep emerging. It has become an interdisciplinary field and has attracted wide attention from researchers and practitioners in many areas, such as medicine, aerial and satellite remote sensing, industrial inspection, security and the military.
Current foreground extraction methods mainly include threshold-based, edge-based, region-based, graph-cut-based, energy-functional-based and deep-learning-based methods. Among them, graph-cut-based methods are favored for their high extraction precision and simple operation. A graph-cut-based method is a combinatorial optimization method based on graph theory: according to the user's interaction information, it maps an image to a network graph, builds an energy function over the labels, and cuts the network graph for a finite number of iterations with the max-flow/min-cut algorithm to obtain the minimum cut of the network graph, which is taken as the foreground extraction result of the image. Because of the human-computer interaction, however, the manual workload is too large when many images are processed, which limits the application of these methods in engineering. For example, Meng Tang et al. published "GrabCut in One Cut" at the 2013 IEEE International Conference on Computer Vision. In that method the user selects the foreground region, the region containing the foreground is mapped to a graph, and One Cut cuts the mapped graph for a finite number of iterations to obtain the foreground extraction result of the image. The method still requires human-computer interaction to mark the region containing the foreground, which makes the extraction process cumbersome, and a finite number of energy optimization iterations yields only a near-optimal minimum cut, so an accurate foreground edge is hard to obtain.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art described above by proposing an automatic image foreground extraction method based on multi-view fusion. It addresses two problems of existing graph-cut-based foreground extraction methods: the cumbersome extraction process caused by human-computer interaction, and the inaccurate foreground edges caused by a finite number of energy optimization iterations.
To achieve this object, the technical solution of the invention comprises the following steps:
(1) Train an SVM classifier to obtain a trained SVM classifier;
(2) Convert the image to be extracted to grayscale to obtain a grayscale image;
(3) Use the trained SVM classifier to detect, in the grayscale image, a sub-image p_k containing the foreground target:
(3a) Slide multi-scale windows over the grayscale image row by row at a set interval to obtain an image set P = {p_1, p_2, ..., p_k, ..., p_q} composed of multiple sub-images, where k ∈ [1, q], p_k is the k-th sub-image and q is the number of sub-images;
(3b) Extract the histogram of oriented gradients (HOG) feature of each sub-image p_k in the image set P and input it to the trained SVM classifier for classification, computing the label l_pk of sub-image p_k;
(3c) Determine whether the label l_pk of sub-image p_k is positive. If so, sub-image p_k contains the foreground target: record the position of p_k in the image to be extracted, namely the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, and perform step (4); otherwise, discard sub-image p_k;
(4) Perform foreground extraction on the image to be extracted:
Use the position (x_min, y_min) of the top-left pixel and the position (x_max, y_max) of the bottom-right pixel of sub-image p_k in the image to be extracted in place of the human-computer interaction of the GrabCut algorithm, and perform foreground extraction on the image to be extracted with the result of this substitution, obtaining the extraction result S_1(x, y) under the pixel view of the image to be extracted;
(5) Compute the superpixels of the image to be extracted with the simple linear iterative clustering (SLIC) algorithm to obtain the image under the superpixel view: B = {b_1, b_2, ..., b_i, ..., b_m}, i ∈ [1, m], where b_i is the i-th superpixel and m is the number of superpixels;
(6) Perform multi-view fusion of the image B under the superpixel view and the extraction result S_1(x, y) under the pixel view of the image to be extracted, obtaining the foreground S_2(x_i, y_i) of the image to be extracted.
Compared with the prior art, the invention has the following advantages:
1) The invention uses the trained SVM classifier to obtain the sub-image containing the foreground in the image to be extracted, and uses the position coordinates of the sub-image in the image to be extracted as the algorithm input, in place of the rectangular region that the GrabCut algorithm normally obtains through human-computer interaction. Combining the SVM classifier with the GrabCut algorithm in this way completes the foreground extraction process automatically. This solves the problem that, in existing graph-cut-based foreground extraction methods, human-computer interaction makes the extraction process cumbersome, and it effectively improves the efficiency of image foreground extraction.
2) The invention uses the SLIC algorithm to extract superpixels from the image to be extracted and takes full advantage of the good consistency within a superpixel block. Fusing the image under the superpixel view with the extraction result under the pixel view yields an accurate foreground extraction result for the image to be extracted.
3) By introducing superpixels, the invention makes the foreground extraction result more accurate and smoother. This solves the problem that, in existing graph-cut-based foreground extraction methods, a finite number of energy optimization iterations leads to inaccurate foreground edges, and it improves the precision of image foreground extraction.
Description of the drawings
Fig. 1 is the implementation flow chart of the invention;
Fig. 2 shows the structure of the sample image set used in the invention;
Fig. 3 is a schematic diagram of HOG feature extraction in the invention;
Fig. 4 is a visualization of the HOG features in the invention;
Fig. 5 shows the experimental results of extracting pedestrians and leaves as foreground with the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and specific embodiments.
With reference to Fig. 1, the automatic image foreground extraction method based on multi-view fusion comprises the following steps:
Step 1, train the SVM classifier.
(1a) Collect a sample image set containing the foreground class, and convert all sample images in it to grayscale to obtain a sample grayscale image set;
The structure of the sample image set containing the foreground class is shown in Fig. 2. It comprises positive samples, negative samples and a sample label file, where a positive sample is an image that contains the foreground, a negative sample is an image that does not contain the foreground, and the sample label file describes the class and storage location of each positive and negative sample;
All sample images in the sample image set are converted to grayscale by taking the weighted average of the three channels of the sample image, namely the red component R, the green component G and the blue component B, which gives the gray value Gray of the sample grayscale image:
Gray = R × 0.299 + G × 0.587 + B × 0.114;
(1b) Extract the histogram of oriented gradients (HOG) feature of each image in the sample grayscale image set:
With reference to Fig. 3, this step is implemented as follows:
(1b1) Divide the input image into several connected, adjacent and non-overlapping cells, and compute in each cell the gradient magnitude G(x, y) and the gradient direction α(x, y) of every pixel. The gradient magnitude is

$$G(x,y)=\sqrt{G_x(x,y)^2+G_y(x,y)^2}$$

and the gradient direction α(x, y) is the arctangent of the ratio of the two gradient components,
where G_x(x, y) = H(x+1, y) − H(x−1, y) is the horizontal gradient at pixel (x, y) of the input image, G_y(x, y) = H(x, y+1) − H(x, y−1) is the vertical gradient at pixel (x, y) of the input image, and H(x, y) is the pixel value at pixel (x, y) of the input image;
(1b2) Divide all gradient directions α(x, y) into 9 angle ranges, which form the horizontal axis of the histogram, and use the accumulated gradient values falling in each angle range as the vertical axis of the histogram, obtaining a gradient histogram;
(1b3) Compute the gradient histogram of each cell to obtain the feature descriptor of each cell;
(1b4) Group 8 × 8 cells into a block and concatenate the feature descriptors of all cells in the block to obtain the HOG feature descriptor of the block;
(1b5) Concatenate the HOG feature descriptors of all blocks in the input image to obtain the HOG feature of the input image. A visualization of the HOG feature is shown in Fig. 4, where Fig. 4(a) is an example image and Fig. 4(b) is the HOG feature map of the example image. As Fig. 4 shows, the HOG feature describes the appearance and shape of a local target well through the density of gradients or edge directions;
(1b6) Concatenate the HOG features of all input images in the sample grayscale image set to obtain the HOG feature set of the sample grayscale image set;
(1c) Train the SVM classifier with all HOG features in the sample HOG feature set to obtain the trained SVM classifier.
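Sub-steps (1b1) and (1b2) can be illustrated with a minimal sketch in plain Python. It is an illustration under stated assumptions, not the patented implementation: the function name `cell_histogram` is hypothetical, the direction is taken as the conventional unsigned angle of atan2(G_y, G_x) folded into [0°, 180°), the 9 bins are assumed to be equally wide, and border pixels of the cell are skipped because their central differences would need neighbouring cells.

```python
import math

def cell_histogram(cell, bins=9):
    """Gradient-orientation histogram of one grayscale cell.

    `cell` is a list of rows, indexed as cell[y][x]. Only interior pixels
    are used (assumption), since the central differences need both neighbours.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins  # assumption: 9 equal bins over [0, 180)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # G_x = H(x+1, y) - H(x-1, y)
            gy = cell[y + 1][x] - cell[y - 1][x]  # G_y = H(x, y+1) - H(x, y-1)
            magnitude = math.hypot(gx, gy)        # sqrt(G_x^2 + G_y^2)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle // bin_width), bins - 1)
            hist[index] += magnitude              # accumulate gradient magnitude
    return hist
```

Concatenating such cell histograms over a block, and the block descriptors over the image, gives a feature vector of the kind described in (1b3) to (1b5).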
Step 2, convert the image to be extracted to grayscale to obtain a grayscale image.
Take the weighted average of the three channels of the image to be extracted, namely the red component R′, the green component G′ and the blue component B′, to obtain the gray value Gray′ of each pixel in the image to be extracted:
Gray′ = R′ × 0.299 + G′ × 0.587 + B′ × 0.114
The grayscale image of the image to be extracted is obtained from the gray value of each pixel.
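The weighted average above can be sketched in a few lines of plain Python. The function name `to_gray` and the representation of an image as rows of (R, G, B) tuples are assumptions made only for this illustration.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of gray values.

    Uses the weights given in the text: 0.299, 0.587 and 0.114.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]
```

Because the three weights sum to 1, a pixel with equal R, G and B values keeps that value as its gray level.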
Step 3, use the trained SVM classifier to detect, in the grayscale image of the image to be extracted, the sub-image p_k containing the foreground target, and obtain the position (x_min, y_min) of the top-left pixel and the position (x_max, y_max) of the bottom-right pixel of sub-image p_k in the image to be extracted.
(3a) Slide multi-scale windows over the grayscale image of the image to be extracted row by row at a set interval to obtain an image set composed of multiple sub-images: P = {p_1, p_2, ..., p_k, ..., p_q}, where p_k is the k-th sub-image and q is the number of sub-images;
(3b) Extract the HOG feature of each sub-image p_k in the image set P and input it to the trained SVM classifier for classification, obtaining the label l_pk of sub-image p_k. The label l_pk is computed from the normal vector of the hyperplane of the SVM classifier, the HOG feature x_k of sub-image p_k, and the offset term φ of the hyperplane of the SVM classifier, where k ∈ [1, q] and q is the number of sub-images.
(3c) Determine whether the label l_pk of sub-image p_k is positive. If so, sub-image p_k contains the foreground target: record the position of p_k in the image to be extracted, namely the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, form the rectangular region containing the foreground from these two positions, and perform step 4; otherwise, discard sub-image p_k.
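The detection loop of step 3 can be sketched as follows. This is a hedged illustration rather than the patent's implementation: it assumes a linear SVM whose label is the sign of the normal vector applied to the feature plus the offset, and the names `sliding_windows`, `svm_label`, `detect` and the callback `feature_of` (which stands in for HOG extraction on a window) are all hypothetical.

```python
def sliding_windows(width, height, window_sizes, stride):
    """Yield (x_min, y_min, x_max, y_max) boxes, scanning row by row.

    `window_sizes` is a list of (window_width, window_height) pairs,
    one per scale; `stride` is the set sliding interval in pixels.
    """
    for win_w, win_h in window_sizes:
        for y_min in range(0, height - win_h + 1, stride):
            for x_min in range(0, width - win_w + 1, stride):
                yield (x_min, y_min, x_min + win_w - 1, y_min + win_h - 1)

def svm_label(weights, bias, feature):
    """Label of a linear SVM (assumption): +1 if w . x + bias > 0, else -1."""
    score = sum(w * x for w, x in zip(weights, feature)) + bias
    return 1 if score > 0 else -1

def detect(width, height, window_sizes, stride, feature_of, weights, bias):
    """Return the boxes whose features the classifier labels as positive."""
    return [
        box
        for box in sliding_windows(width, height, window_sizes, stride)
        if svm_label(weights, bias, feature_of(box)) > 0
    ]
```

Each returned box carries the (x_min, y_min) and (x_max, y_max) corners that step 4 uses in place of a user-drawn rectangle.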
Step 4, perform foreground extraction on the image to be extracted.
Foreground extraction methods mainly include threshold-based, edge-based, region-based, graph-cut-based, energy-functional-based and deep-learning-based methods. This example uses, but is not limited to, the GrabCut algorithm, one of the graph-cut-based foreground extraction methods. It is implemented as follows: the positions (x_min, y_min) and (x_max, y_max) of the top-left and bottom-right pixels of sub-image p_k in the image to be extracted, together with the image to be extracted itself, are used as the input of the GrabCut algorithm, and foreground extraction is performed on the image to be extracted, obtaining the extraction result S_1(x, y) under the pixel view of the image to be extracted.
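How the detected corners can stand in for the user-drawn rectangle is sketched below in plain Python, as an illustration under assumptions. The helper names `box_to_rect` and `initial_mask` are hypothetical. The sketch relies on two facts about OpenCV's `grabCut`: it accepts the rectangle as (x, y, width, height), and its rectangle initialisation marks pixels outside the rectangle as definite background (code 0) and pixels inside as probable foreground (code 3).

```python
BACKGROUND = 0           # same integer code as OpenCV's GC_BGD
PROBABLE_FOREGROUND = 3  # same integer code as OpenCV's GC_PR_FGD

def box_to_rect(x_min, y_min, x_max, y_max):
    """Convert inclusive corner coordinates to an (x, y, width, height) rectangle."""
    return (x_min, y_min, x_max - x_min + 1, y_max - y_min + 1)

def initial_mask(width, height, box):
    """Build the initial GrabCut label mask implied by a detection box.

    Pixels outside the box are definite background; pixels inside are
    probable foreground and are refined by the graph-cut iterations.
    """
    x_min, y_min, x_max, y_max = box
    return [
        [
            PROBABLE_FOREGROUND
            if x_min <= x <= x_max and y_min <= y <= y_max
            else BACKGROUND
            for x in range(width)
        ]
        for y in range(height)
    ]
```

With OpenCV available, the rectangle from `box_to_rect` could be passed to `cv2.grabCut` in rectangle-initialisation mode; the iteration count and the surrounding call are left out here because the text does not specify them.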
Step 5, compute the superpixels of the image to be extracted.
Existing methods for computing the superpixels of an image include graph-theory-based methods and gradient-descent-based methods. This example uses, but is not limited to, the SLIC algorithm, one of the gradient-descent-based methods, for superpixel extraction. The specific steps are as follows:
(5a) Convert the image to be extracted from the RGB color space to the CIE-Lab color space to obtain a CIE-Lab image;
There is no direct conversion formula between RGB and Lab, so the XYZ color space must be used as an intermediate layer when converting the image to be extracted from the RGB color space to the CIE-Lab color space. In the first stage of the conversion, r, g, b are the three channel components of a pixel of the image to be extracted, and R, G, B are the three channel components obtained after r, g, b are corrected by the correction function gamma(t);
The three channel components X, Y, Z of the XYZ color space are then obtained from R, G, B by the conversion formula from the RGB color space to the XYZ color space, in which the conversion matrix is a 3 × 3 matrix;
In the CIE-Lab color space, the values of the three channels L, a, b are obtained by the conversion formula from the XYZ color space to the CIE-Lab color space:

$$L = 116\,f(Y/Y_n) - 16$$

$$a = 500\,\bigl(f(X/X_n) - f(Y/Y_n)\bigr)$$

$$b = 200\,\bigl(f(Y/Y_n) - f(Z/Z_n)\bigr)$$

where X, Y, Z are the three channel components obtained by converting from the RGB color space to the XYZ color space; the values of X_n, Y_n, Z_n are 95.047, 100.0 and 108.883 respectively; and f(X/X_n), f(Y/Y_n), f(Z/Z_n) are values of the piecewise function f of the CIE-Lab conversion.
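A minimal sketch of the RGB to XYZ to CIE-Lab chain follows. It is an assumption-laden illustration: the gamma correction, the 3 × 3 matrix and the piecewise function f are the standard sRGB and CIE definitions for a D65 white point, which match the white-point values 95.047, 100.0 and 108.883 given in the text but are not confirmed to be the exact formulas of the patent. The function names are hypothetical.

```python
D65_WHITE = (95.047, 100.0, 108.883)  # X_n, Y_n, Z_n from the text

def gamma(t):
    """Standard sRGB inverse gamma correction for a channel value in [0, 1]."""
    return ((t + 0.055) / 1.055) ** 2.4 if t > 0.04045 else t / 12.92

def f(t):
    """Standard CIE-Lab piecewise function."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3.0 * delta ** 2) + 4.0 / 29.0

def rgb_to_lab(r, g, b):
    """Convert 8-bit r, g, b to (L, a, b) via the XYZ color space."""
    R, G, B = (gamma(c / 255.0) for c in (r, g, b))
    # Standard sRGB -> XYZ matrix (assumption), scaled so that Y_n = 100.
    X = 100.0 * (0.4124 * R + 0.3576 * G + 0.1805 * B)
    Y = 100.0 * (0.2126 * R + 0.7152 * G + 0.0722 * B)
    Z = 100.0 * (0.0193 * R + 0.1192 * G + 0.9505 * B)
    Xn, Yn, Zn = D65_WHITE
    L = 116.0 * f(Y / Yn) - 16.0
    a = 500.0 * (f(X / Xn) - f(Y / Yn))
    b_lab = 200.0 * (f(Y / Yn) - f(Z / Zn))
    return L, a, b_lab
```

Under these definitions black maps to L = 0 and white to approximately L = 100 with a and b close to 0, which is a quick sanity check for any implementation of this step.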
(5b) Initialize the superpixel cluster centers: set the number of superpixels m = 200 and distribute the superpixel cluster centers uniformly over the CIE-Lab image according to the number of superpixels, obtaining the cluster center set C_d = {c_1, c_2, ..., c_i, ..., c_m}, where c_i = [l_i, a_i, b_i, x_i, y_i] is the i-th cluster center after the d-th iteration, m centers in total; l_i, a_i, b_i are the three channels of the CIE-Lab color space, and (x_i, y_i) are the coordinates of superpixel b_i;
(5c) For each pixel of the CIE-Lab image, set the label l(pixel) = −1 and the distance d(pixel) = ∞;
(5d) For each cluster center c_i in the cluster center set C_d, compute the gradient values of all pixels in its 3 × 3 neighborhood and move the cluster center to the pixel with the smallest gradient in that neighborhood, obtaining the new cluster center set C_{d+1};
(5e) For each cluster center c_i in the cluster center set C_d and each pixel = [l_p, a_p, b_p, x_p, y_p] in its 2S × 2S neighborhood, compute the distance D(pixel) between c_i and the pixel, where d_c is the color difference between the cluster center and the pixel, d_s is the spatial distance between the cluster center and the pixel, M is the maximum value of d_c, S is determined by N and m, N is the total number of pixels of the image, and m is the set number of superpixels;
(5f) Compare d(pixel) with D(pixel). If D(pixel) < d(pixel), assign D(pixel) to d(pixel), i.e. d(pixel) = D(pixel), so that d(pixel) records the distance from the pixel to cluster center c_i, and mark the pixel as belonging to the i-th superpixel with the pixel label l(pixel) = i, obtaining the new superpixel b_i;
(5g) Repeat steps (5d) to (5f), updating the cluster centers until the residual converges, to obtain the superpixel image B = {b_1, b_2, ..., b_i, ..., b_m}.
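The distance computation and label assignment of sub-steps (5c), (5e) and (5f) can be sketched as below. The sketch assumes the distance of the original SLIC algorithm, D = sqrt((d_c / M)² + (d_s / S)²) with Euclidean d_c and d_s, and treats M and S as given parameters; the text's own distance formula is not reproduced here, so this form is an assumption. The function names are hypothetical.

```python
import math

def slic_distance(center, pixel, M, S):
    """Combined distance between a cluster center and a pixel.

    Both arguments are (l, a, b, x, y) tuples. Assumed form (standard SLIC):
    D = sqrt((d_c / M)^2 + (d_s / S)^2).
    """
    lc, ac, bc, xc, yc = center
    lp, ap, bp, xp, yp = pixel
    d_c = math.sqrt((lp - lc) ** 2 + (ap - ac) ** 2 + (bp - bc) ** 2)  # color difference
    d_s = math.hypot(xp - xc, yp - yc)                                 # spatial distance
    return math.sqrt((d_c / M) ** 2 + (d_s / S) ** 2)

def assign(pixels, centers, M, S):
    """One assignment pass: label each pixel with its nearest cluster center."""
    labels = [-1] * len(pixels)      # (5c): l(pixel) = -1
    dist = [math.inf] * len(pixels)  # (5c): d(pixel) = infinity
    for i, center in enumerate(centers):
        for j, pixel in enumerate(pixels):
            # (5e): only pixels inside the 2S x 2S window around the center
            if abs(pixel[3] - center[3]) > S or abs(pixel[4] - center[4]) > S:
                continue
            D = slic_distance(center, pixel, M, S)
            if D < dist[j]:          # (5f): keep the smaller distance
                dist[j], labels[j] = D, i
    return labels
```

A full iteration would follow this pass by recomputing each center from its assigned pixels and repeating until the centers stop moving, as in (5g).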
Step 6, perform multi-view fusion of the image under the superpixel view and the extraction result S_1(x, y) under the pixel view of the image to be extracted, obtaining the foreground S_2(x_i, y_i) of the image to be extracted.
(6a) For each superpixel b_i of the image B under the superpixel view, sum the labels l_ij that all pixels contained in b_i have in the extraction result S_1(x, y) under the pixel view, obtaining the label confidence Score_bi of superpixel b_i:
Score_bi = Σ_j l_ij;
(6b) Set a confidence threshold gate and compare it with the label confidence Score_bi of superpixel b_i to obtain the label l_bi of superpixel b_i under the superpixel view, and use l_bi as the label S_2(x_i, y_i) of pixel (x_i, y_i). All pixels within superpixel b_i take the same label as the superpixel, and S_2(x_i, y_i) is the foreground of the image to be extracted, where (x_i, y_i) ∈ b_i;
The smaller the confidence threshold gate is set, the smaller the probability that superpixel b_i is judged to be foreground; the larger gate is, the larger the probability that superpixel b_i is judged to be foreground. When gate is too large, however, excessive noise appears in the foreground extraction result;
In the formula that compares the confidence threshold gate with the label confidence Score_bi of superpixel b_i to obtain the label l_bi, l_bi is the label of superpixel b_i, num_bi is the number of pixels in superpixel b_i, gate is the confidence threshold, 1 is the foreground label and 0 is the background label;
The multi-view fusion refers to fusing the image B under the superpixel view with the extraction result S_1(x, y) under the pixel view.
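The fusion of step 6 can be sketched as a per-superpixel vote. This is a hypothetical illustration: the exact comparison formula involving gate is not reproduced in the text above, so the sketch uses its own rule, namely that a superpixel is labeled foreground when the fraction Score_bi / num_bi of its pixels labeled foreground reaches a parameter `min_fg_fraction`. That parameter and the function name `fuse` are assumptions and need not behave like the threshold gate of the text.

```python
def fuse(pixel_labels, superpixel_ids, min_fg_fraction=0.5):
    """Fuse a pixel-level foreground mask with a superpixel map.

    `pixel_labels[y][x]` is 1 (foreground) or 0 (background), as in S_1.
    `superpixel_ids[y][x]` is the index i of the superpixel b_i containing (x, y).
    Returns a mask in which every pixel takes the label of its superpixel.
    """
    score = {}  # Score_bi: sum of pixel labels inside superpixel i
    count = {}  # num_bi: number of pixels inside superpixel i
    for label_row, id_row in zip(pixel_labels, superpixel_ids):
        for label, sp in zip(label_row, id_row):
            score[sp] = score.get(sp, 0) + label
            count[sp] = count.get(sp, 0) + 1
    # Assumed decision rule: foreground if enough of the superpixel is foreground.
    sp_label = {
        sp: 1 if score[sp] / count[sp] >= min_fg_fraction else 0
        for sp in score
    }
    return [[sp_label[sp] for sp in id_row] for id_row in superpixel_ids]
```

Because every pixel inherits the label of its superpixel, isolated mislabeled pixels along the GrabCut boundary are overruled by the majority within their superpixel, which is the smoothing effect the text attributes to the fusion.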
The technical effect of the invention is further described below through foreground extraction experiments:
1. Experimental conditions and content
The experiments extract pedestrian and leaf targets respectively. The training data are pedestrian and leaf image sets collected at random from the Internet, containing 736 and 186 images respectively. Positive and negative samples are taken from every picture and labels are then made, forming a sample image set containing the pedestrian class and a sample image set containing the leaf class.
The method was programmed in MATLAB R2017a to perform foreground extraction on the pedestrian and leaf targets; the results are shown in Fig. 5.
2. Analysis of experimental results:
As Fig. 5 shows, for the 4 images tested in each of the two classes, the output foreground extraction results contain no noise and the extracted foreground edges are good. For the 4 images of the leaf class, for example, the extracted foreground edges are very accurate. The method is also tolerant of how complete the foreground is in the input image: for input image 3 of the pedestrian class, for example, a good foreground extraction result is still obtained with a half-length portrait as input.
In addition, Fig. 5 shows that once the SVM classifier has been trained, the invention completes the foreground extraction of the image to be extracted automatically and obtains its foreground extraction result, which solves the problem that existing graph-cut-based foreground extraction methods need human-computer interaction to assist extraction. At the same time, the invention takes full advantage of the good consistency within a superpixel block to repair the edges of the extraction result under the pixel view output by the GrabCut algorithm, making the foreground extraction result more accurate and smoother and improving foreground extraction precision.
Claims (8)
1. An automatic image foreground extraction method based on multi-view fusion, characterized by:
(1) Training an SVM classifier to obtain a trained SVM classifier;
(2) Converting the image to be extracted to grayscale to obtain a grayscale image;
(3) Using the trained SVM classifier to detect, in the grayscale image, a sub-image p_k containing the foreground target:
(3a) Sliding multi-scale windows over the grayscale image row by row at a set interval to obtain an image set P = {p_1, p_2, ..., p_k, ..., p_q} composed of multiple sub-images, where k ∈ [1, q], p_k is the k-th sub-image and q is the number of sub-images;
(3b) Extracting the histogram of oriented gradients (HOG) feature of each sub-image p_k in the image set P and inputting it to the trained SVM classifier for classification, computing the label l_pk of sub-image p_k;
(3c) Determining whether the label l_pk of sub-image p_k is positive; if so, sub-image p_k contains the foreground target, the position of p_k in the image to be extracted is recorded, namely the position (x_min, y_min) of its top-left pixel and the position (x_max, y_max) of its bottom-right pixel in the image to be extracted, and step (4) is performed; otherwise, sub-image p_k is discarded;
(4) Performing foreground extraction on the image to be extracted:
Using the position (x_min, y_min) of the top-left pixel and the position (x_max, y_max) of the bottom-right pixel of sub-image p_k in the image to be extracted in place of the human-computer interaction of the GrabCut algorithm, and performing foreground extraction on the image to be extracted with the result of this substitution, obtaining the extraction result S_1(x, y) under the pixel view of the image to be extracted;
(5) Computing the superpixels of the image to be extracted with the simple linear iterative clustering (SLIC) algorithm to obtain the image under the superpixel view: B = {b_1, b_2, ..., b_i, ..., b_m}, i ∈ [1, m], where b_i is the i-th superpixel and m is the number of superpixels;
(6) Performing multi-view fusion of the image B under the superpixel view and the extraction result S_1(x, y) under the pixel view of the image to be extracted, obtaining the foreground S_2(x_i, y_i) of the image to be extracted.
2. The method according to claim 1, wherein the SVM classifier in step (1) is trained as follows:
(1a) Collect a sample image set containing the foreground category, and convert all sample images in it to grayscale, obtaining a sample grayscale image set;
(1b) Extract the histogram of oriented gradients (HOG) feature of each image in the sample grayscale image set, obtaining a sample HOG feature set;
(1c) Train the SVM classifier with all HOG features in the sample HOG feature set, obtaining the trained SVM classifier.
3. The method according to claim 2, characterized in that: in step (1a), all sample images in the sample image set are converted to grayscale by taking a weighted average of the three channels of each sample image, namely the red component R, the green component G and the blue component B, to obtain the gray value Gray of the sample grayscale image:
Gray = R × 0.299 + G × 0.587 + B × 0.114.
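The weighted average above can be sketched as follows. This is a minimal illustration that assumes an image is stored as nested lists of (R, G, B) tuples; the function names are hypothetical.

```python
def to_gray(r, g, b):
    """Weighted average of the three channels, using the weights from the claim."""
    return r * 0.299 + g * 0.587 + b * 0.114

def image_to_gray(image):
    """Apply to_gray to every (R, G, B) pixel of a row-major image."""
    return [[to_gray(*pixel) for pixel in row] for row in image]

# A 1 x 3 image: pure red, pure green, pure blue.
gray_row = image_to_gray([[(255, 0, 0), (0, 255, 0), (0, 0, 255)]])[0]
```

Because the three weights sum to 1, a white pixel keeps (up to floating-point rounding) its full intensity, and green contributes most to the perceived brightness.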
4. The method according to claim 2, characterized in that: in step (1b), the HOG feature of each image in the sample grayscale image set is extracted by the following steps:
(1b1) Divide the input image into several adjacent, non-overlapping cells, and compute in each cell the gradient magnitude $G(x, y)$ and gradient direction $\alpha(x, y)$ of each pixel:
$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}$$
$$\alpha(x, y) = \tan^{-1}\left(\frac{G_x(x, y)}{G_y(x, y)}\right)$$
where $G_x(x, y) = H(x+1, y) - H(x-1, y)$ and $G_y(x, y) = H(x, y+1) - H(x, y-1)$ are the horizontal and vertical gradients at pixel $(x, y)$ of the input image, respectively, and $H(x+1, y)$, $H(x-1, y)$, $H(x, y+1)$ and $H(x, y-1)$ are the pixel values of the input image at pixels $(x+1, y)$, $(x-1, y)$, $(x, y+1)$ and $(x, y-1)$, respectively;
(1b2) Divide all gradient directions $\alpha(x, y)$ into 9 angular ranges, which form the horizontal axis of the histogram, and use the accumulated gradient values falling in each angular range as the vertical axis, obtaining the gradient histogram;
(1b3) Compute the gradient histogram of each cell, obtaining the feature descriptor of each cell;
(1b4) Group n × n cells into a block, and concatenate the feature descriptors of all cells in the block, obtaining the HOG feature descriptor of the block;
(1b5) Concatenate the HOG feature descriptors of all blocks in the input image, obtaining the HOG feature of the input image;
(1b6) Concatenate the HOG features of all input images in the sample grayscale image set, obtaining the HOG feature set of the sample grayscale image set.
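Steps (1b1) to (1b4) can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the claimed implementation: the image is a nested list indexed as `H[y][x]`, the direction follows the ratio $G_x/G_y$ as written in the claim (computed with `atan2` so that a zero $G_y$ does not cause a division by zero) and is folded into $[0, \pi)$, the 9 angular ranges are of equal width, and cells are placed away from the image border. All function names are hypothetical.

```python
import math

def gradients(H, x, y):
    """Central differences at (x, y), matching G_x and G_y in the claim."""
    gx = H[y][x + 1] - H[y][x - 1]
    gy = H[y + 1][x] - H[y - 1][x]
    return gx, gy

def cell_histogram(H, x0, y0, size, bins=9):
    """Accumulate gradient magnitudes of one size x size cell into direction bins."""
    hist = [0.0] * bins
    bin_width = math.pi / bins
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            gx, gy = gradients(H, x, y)
            magnitude = math.hypot(gx, gy)
            # Direction of the ratio gx / gy, folded into [0, pi).
            angle = math.atan2(gx, gy) % math.pi
            hist[min(int(angle / bin_width), bins - 1)] += magnitude
    return hist

def block_descriptor(H, x0, y0, cell, n):
    """Concatenate the histograms of the n x n cells that form one block."""
    descriptor = []
    for j in range(n):
        for i in range(n):
            descriptor.extend(cell_histogram(H, x0 + i * cell, y0 + j * cell, cell))
    return descriptor

# A 6 x 6 horizontal ramp: the pixel value equals the column index.
ramp = [[x for x in range(6)] for _ in range(6)]
hist = cell_histogram(ramp, 1, 1, 2)
block = block_descriptor(ramp, 1, 1, 2, 2)
```

On the ramp every pixel has $G_x = 2$ and $G_y = 0$, so all of a cell's magnitude falls into a single bin; a real image spreads it across bins, and that distribution is what the classifier sees.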
5. The method according to claim 1, characterized in that: in step (3b), the label $l_{p_k}$ of sub-image $p_k$ is calculated by the following formula:
Here the k-th normal vector of the hyperplane of the SVM classifier enters the formula together with $x_k$, the histogram of oriented gradients (HOG) feature of sub-image $p_k$ ($k \in [1, q]$, where $q$ is the number of sub-images), and $\varphi$, the offset term of the hyperplane of the SVM classifier.
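For orientation, a linear SVM is commonly evaluated as the sign of the inner product between the hyperplane's normal vector and the feature vector, plus the offset term. The sketch below shows that common decision rule; it is an assumption for illustration and is not taken from the claim text, and the name `svm_label` and the weight vector `w` are hypothetical.

```python
def svm_label(w, x, phi):
    """Return +1 if the feature x lies on the positive side of the hyperplane, else -1."""
    if len(w) != len(x):
        raise ValueError("normal vector and feature must have the same length")
    score = sum(wi * xi for wi, xi in zip(w, x)) + phi
    return 1 if score > 0 else -1

# Hypothetical two-dimensional example.
positive = svm_label([1.0, -1.0], [3.0, 1.0], -1.0)   # score = 1.0
negative = svm_label([1.0, -1.0], [1.0, 3.0], -1.0)   # score = -3.0
```

A positive label corresponds to the case in step (3c) where the sub-image is kept as containing the foreground target.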
6. The method according to claim 1, characterized in that: the superpixels of the image to be extracted are computed in step (5) by the following steps:
(5a) Convert the image to be extracted from the RGB color space to the CIE-Lab color space, obtaining a CIE-Lab image;
(5b) Initialize the superpixel cluster centers: set the number of superpixels, and distribute the superpixel cluster centers uniformly over the CIE-Lab image according to that number, obtaining the cluster center set $C_d$, in which the i-th cluster center after the d-th iteration (m centers in total) is described by $l_i, a_i, b_i$, the three channels of the CIE-Lab color space, and $(x_i, y_i)$, the coordinates of $b_i$;
(5c) For each pixel $pixel$ of the CIE-Lab image, set the label $l(pixel) = -1$ and the distance $d(pixel) = \infty$;
(5d) For each cluster center in the cluster center set $C_d$, compute the gradient values of all pixels in its n × n neighborhood, and move the cluster center to the pixel with the smallest gradient in that neighborhood, obtaining the new cluster center set $C_{d+1}$;
(5e) For each cluster center in the cluster center set $C_d$ and each pixel $pixel = [l_p, a_p, b_p, x_p, y_p]$ in its 2S × 2S neighborhood, compute the distance $D(pixel)$ between the cluster center and the pixel:
$$D(pixel) = \sqrt{d_c^2 + \left(\frac{d_s}{S}\right)^2 M^2}$$
where $d_c$ is the color distance between pixels, $d_s$ is the spatial distance between pixels, $M$ is the maximum of $d_c$, $N$ is the total number of pixels in the image, and $m$ is the set number of superpixels;
(5f) Compare $d(pixel)$ with $D(pixel)$: if $D(pixel) < d(pixel)$, assign $D(pixel)$ to $d(pixel)$ and set $l(pixel) = i$, obtaining the new superpixel $b_i$;
(5g) Repeat steps (5d) to (5f), updating the cluster centers until the residual converges, obtaining the superpixel image $B = \{b_1, b_2, \ldots, b_i, \ldots, b_m\}$.
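Steps (5c), (5e) and (5f) can be sketched as a single assignment pass. The sketch assumes that $d_c$ and $d_s$ are Euclidean distances in the Lab channels and in the image plane respectively, as in the usual SLIC formulation; that reading, the use of a window of half-width S around each center, and the function names are assumptions for illustration.

```python
import math

def slic_distance(center, pixel, S, M):
    """D = sqrt(d_c^2 + (d_s / S)^2 * M^2) for 5-tuples (l, a, b, x, y)."""
    lc, ac, bc, xc, yc = center
    lp, ap, bp, xp, yp = pixel
    d_c = math.sqrt((lc - lp) ** 2 + (ac - ap) ** 2 + (bc - bp) ** 2)
    d_s = math.sqrt((xc - xp) ** 2 + (yc - yp) ** 2)
    return math.sqrt(d_c ** 2 + (d_s / S) ** 2 * M ** 2)

def assign(pixels, centers, S, M):
    """One pass of steps (5c), (5e), (5f): label each pixel with its nearest center."""
    labels = [-1] * len(pixels)          # l(pixel) = -1
    dist = [math.inf] * len(pixels)      # d(pixel) = infinity
    for i, c in enumerate(centers):
        for k, p in enumerate(pixels):
            # Only pixels inside the 2S x 2S window around the center are compared.
            if abs(p[3] - c[3]) <= S and abs(p[4] - c[4]) <= S:
                D = slic_distance(c, p, S, M)
                if D < dist[k]:
                    dist[k], labels[k] = D, i
    return labels, dist

# Hypothetical data: two centers and two pixels, each close to one center.
centers = [(50, 0, 0, 0, 0), (10, 0, 0, 9, 9)]
pixels = [(50, 0, 0, 1, 1), (10, 0, 0, 8, 8)]
labels, dist = assign(pixels, centers, S=5, M=10)
```

Restricting the comparison to the window around each center is what keeps the clustering cost linear in the number of pixels rather than proportional to pixels times centers.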
7. The method according to claim 1, wherein the multi-view fusion in step (6) of the image $B$ under the superpixel view and the extraction result $S_1(x, y)$ under the pixel view of the image to be extracted is carried out as follows:
(6a) Weight the labels $l_{ij}$, taken from the extraction result $S_1(x, y)$ under the pixel view, of all pixels contained in superpixel $b_i$ of the image $B$ under the superpixel view, obtaining the label confidence $Score_{b_i}$ of superpixel $b_i$;
(6b) Set a confidence threshold $gate$, and compare it with the label confidence $Score_{b_i}$ of superpixel $b_i$, obtaining the label $l_{b_i}$ of $b_i$ under the superpixel view:
$$l_{b_i} = \begin{cases} 1 & \text{if } Score_{b_i} > num_{b_i}/gate \\ 0 & \text{if } Score_{b_i} < num_{b_i}/gate \end{cases}$$
where $num_{b_i}$ is the number of pixels in superpixel $b_i$, 1 is the foreground label, and 0 is the background label.
(6c) Take the label $l_{b_i}$ as the label $S_2(x_i, y_i)$ of each pixel $(x_i, y_i)$, where $(x_i, y_i) \in b_i$; $S_2(x_i, y_i)$ is the foreground of the image to be extracted.
8. The method according to claim 7, characterized in that: the label confidence $Score_{b_i}$ of superpixel $b_i$ in step (6a) is obtained by summing the labels, in the extraction result $S_1(x, y)$ under the pixel view, of all pixels contained in superpixel $b_i$ of the image $B$ under the superpixel view, i.e.:
$$Score_{b_i} = \sum l_{ij}$$
where $Score_{b_i}$ is the label confidence of superpixel $b_i$.
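The fusion of steps (6a) to (6c), with the confidence computed as the plain sum of pixel labels, can be sketched as follows. The dictionary-based data layout and the function name `fuse` are assumptions for illustration; the case $Score_{b_i} = num_{b_i}/gate$, for which no label is specified, is treated as background here, which is also an assumption.

```python
def fuse(pixel_labels, superpixel_of, gate):
    """Give every pixel the thresholded label of the superpixel it belongs to.

    pixel_labels:  dict mapping (x, y) to 0 or 1, the pixel-view result S1.
    superpixel_of: dict mapping (x, y) to the index i of its superpixel b_i.
    """
    score, num = {}, {}
    for xy, i in superpixel_of.items():
        score[i] = score.get(i, 0) + pixel_labels[xy]   # Score = sum of labels
        num[i] = num.get(i, 0) + 1                      # num = pixel count
    # Foreground only if the score strictly exceeds num / gate (ties -> background).
    sp_label = {i: 1 if score[i] > num[i] / gate else 0 for i in score}
    return {xy: sp_label[i] for xy, i in superpixel_of.items()}

# Hypothetical data: superpixel 0 has four pixels, superpixel 1 has two.
superpixel_of = {(0, 0): 0, (1, 0): 0, (0, 1): 0, (1, 1): 0, (2, 0): 1, (2, 1): 1}
pixel_labels = {(0, 0): 1, (1, 0): 1, (0, 1): 1, (1, 1): 0, (2, 0): 1, (2, 1): 0}
S2 = fuse(pixel_labels, superpixel_of, gate=2)
```

With `gate=2` the rule amounts to a majority vote inside each superpixel: the stray background pixel at (1, 1) is pulled into the foreground, while the evenly split superpixel 1 is labeled background.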
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711216652.9A CN108090485A (en) | 2017-11-28 | 2017-11-28 | Display foreground extraction method based on various visual angles fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108090485A true CN108090485A (en) | 2018-05-29 |
Family
ID=62172999
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711216652.9A Pending CN108090485A (en) | 2017-11-28 | 2017-11-28 | Display foreground extraction method based on various visual angles fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108090485A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846397A (en) * | 2018-05-31 | 2018-11-20 | 浙江科技学院 | A kind of cable semi-conductive layer automatic testing method based on image procossing |
CN109242968A (en) * | 2018-08-24 | 2019-01-18 | 电子科技大学 | A kind of river three-dimensional modeling method cut based on the super voxel figure of more attributes |
CN109784374A (en) * | 2018-12-21 | 2019-05-21 | 西北工业大学 | Multi-angle of view clustering method based on adaptive neighbor point |
CN112766387A (en) * | 2021-01-25 | 2021-05-07 | 海尔数字科技(上海)有限公司 | Error correction method, device, equipment and storage medium for training data |
CN113393455A (en) * | 2021-07-05 | 2021-09-14 | 武汉智目智能技术合伙企业(有限合伙) | Machine vision technology-based foreign fiber detection method |
CN114677573A (en) * | 2022-05-30 | 2022-06-28 | 上海捷勃特机器人有限公司 | Visual classification method, system, device and computer readable medium |
CN115953780A (en) * | 2023-03-10 | 2023-04-11 | 清华大学 | Multi-dimensional light field complex scene graph construction method based on multi-view information fusion |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102663411A (en) * | 2012-02-29 | 2012-09-12 | 宁波大学 | Recognition method for target human body |
CN107527054A (en) * | 2017-09-19 | 2017-12-29 | 西安电子科技大学 | Prospect extraction method based on various visual angles fusion |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846397A (en) * | 2018-05-31 | 2018-11-20 | 浙江科技学院 | A kind of cable semi-conductive layer automatic testing method based on image procossing |
CN109242968A (en) * | 2018-08-24 | 2019-01-18 | 电子科技大学 | A kind of river three-dimensional modeling method cut based on the super voxel figure of more attributes |
CN109784374A (en) * | 2018-12-21 | 2019-05-21 | 西北工业大学 | Multi-angle of view clustering method based on adaptive neighbor point |
CN112766387A (en) * | 2021-01-25 | 2021-05-07 | 海尔数字科技(上海)有限公司 | Error correction method, device, equipment and storage medium for training data |
CN112766387B (en) * | 2021-01-25 | 2024-01-23 | 卡奥斯数字科技(上海)有限公司 | Training data error correction method, device, equipment and storage medium |
CN113393455A (en) * | 2021-07-05 | 2021-09-14 | 武汉智目智能技术合伙企业(有限合伙) | Machine vision technology-based foreign fiber detection method |
CN114677573A (en) * | 2022-05-30 | 2022-06-28 | 上海捷勃特机器人有限公司 | Visual classification method, system, device and computer readable medium |
CN114677573B (en) * | 2022-05-30 | 2022-08-26 | 上海捷勃特机器人有限公司 | Visual classification method, system, device and computer readable medium |
CN115953780A (en) * | 2023-03-10 | 2023-04-11 | 清华大学 | Multi-dimensional light field complex scene graph construction method based on multi-view information fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20180529 |