CN109919159A - Semantic segmentation optimization method and device for image edges
- Publication number: CN109919159A (application CN201910059828.7A)
- Authority: CN (China)
- Prior art keywords: image, pixel, super-pixel, segmentation, semantic segmentation
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Image Analysis
Abstract
The present invention relates to a semantic segmentation optimization method for image edges, comprising: selecting image data; training and validating an image semantic segmentation model and a fully connected conditional random field (CRF) model using the image data; obtaining a semantic segmentation result of an image using the trained image semantic segmentation model; obtaining a super-pixel segmentation result containing image edge information using a super-pixel segmentation algorithm; optimizing the semantic segmentation result using the super-pixel segmentation result to form a first optimization result; and optimizing the first optimization result using the trained fully connected CRF model. The proposed method efficiently extracts high-level semantic information from the image, preserves image edge information through the super-pixel segmentation algorithm, and improves the accuracy of existing segmentation models at image edges through a local edge optimization algorithm. It is flexible to implement, highly compatible, and robust.
Description
Technical field
The invention belongs to the technical field of image processing, and in particular relates to a semantic segmentation optimization method and device for image edges.
Background art
With the continuous improvement of computer science and the continuous development of multimedia and Internet technology, computer vision, an important branch of computer science represented by digital image processing, is gradually reaching every corner of modern society. Image semantic segmentation is one of the fundamental problems in computer vision. Its goal is to classify each pixel of an image, dividing the image into several regions that are mutually independent, do not overlap, and each carry their own visual meaning, and to assign each region a visual label to support subsequent image analysis and visual understanding. From a macroscopic perspective, semantic segmentation can be regarded as a preprocessing step for scene understanding, which has always been a core problem in computer vision; as demand grows for semantic information extracted from multimedia such as images and video, image semantic segmentation becomes increasingly important. From a microscopic perspective, the goal of semantic segmentation is pixel-level classification: each pixel is classified individually, and the result is a semantic label map for the entire image. From a practical perspective, image semantic segmentation accomplishes two tasks: segmenting targets and recognizing targets.
Researchers have worked on image semantic segmentation since the middle and late decades of the last century. After more than half a century of accumulation, scholars have proposed many semantic segmentation algorithms for different scenarios. Thresholding is one of the most basic methods in image segmentation; it segments an image according to differences in the color or gray value of its pixels. Its semantic recognition performance is mediocre, however, and its drawbacks are obvious: when the gray values of pixels are close or the color differences are small, the error rate is high, and because threshold selection in practice is affected by noise and illumination, a suitable threshold is very hard to obtain, which narrows the scope of the algorithm. Edge-detection-based methods are another class of traditional segmentation methods. Their basic idea is to use the feature inconsistency between regions to detect all edge points in the image and then connect those points into lines according to a set strategy until closed regions are formed. When the gray value changes sharply at image edges and the image is almost free of noise, the results are fairly good, but when the edges are complex the segmentation is unsatisfactory. These algorithms therefore suit images with clear gray-value transitions at edges and little overall noise. Interactive image segmentation is a family of methods based on graph partitioning. The algorithm needs manually supplied cues to distinguish classes, and two-class algorithms are the most common. In a typical interaction the user draws a box around the foreground target or draws lines along the foreground-background boundary, and the algorithm then uses this manually added information as a constraint to generate the segmentation automatically. Because interactive methods need human intervention, they only suit small numbers of pictures; for large numbers of images of complex scenes, frequent manual labeling is tedious and time-consuming. Clustering-based segmentation compares the gray values of pixels and groups pixels with small differences into the same class. Although it needs no prior knowledge and no feature extraction or recognition, which lowers the difficulty of segmentation, clustering depends heavily on the choice of initial seed points: different initializations can produce very different segmentations. It also tends to confuse objects that are similar in color but belong to different classes, so its overall semantic segmentation accuracy is low.
Later, probabilistic graphical models (Probabilistic Graphical Models, PGM) gradually came to researchers' attention; they mainly comprise generative models (Generative Models) and discriminative models (Discriminative Models). The conditional random field (Conditional Random Fields, CRFs), a representative discriminative model, is a typical probabilistic graphical model distinct from generative models. A CRF model can represent relationships between observed variables, including color and positional relationships. The model has been very successful and has become one of the most widely used image semantic segmentation models. Later researchers proposed many improved models on this basis, such as the fully connected conditional random field model, which, while performing semantic segmentation, takes into account low-level image details such as texture as well as global context and smoothness priors, significantly improving segmentation results.
Traditional image processing algorithms must process every pixel. With the rapid development of multimedia technology, everyday images are becoming sharper and higher in resolution, and traditional algorithms face challenges in both time complexity and space complexity. The heavy resource consumption of these algorithms has pushed researchers to look for new solutions, and super-pixel segmentation emerged as a result: by moving the problem from the pixel level to the region level, it reduces both computation time and memory usage. In general, a super-pixel is a set of similar pixels. In ordinary images a super-pixel usually consists of part of a single object, so its pixels are similar in low-level features such as color, position, and texture, and super-pixels are separated from one another by natural edges and do not overlap. Because super-pixel segmentation is efficient, using it as a preprocessing step for subsequent image processing can greatly reduce the computational complexity of the overall algorithm. And because the resulting super-pixels carry some of the features of the objects they belong to, combining them with low-level detail information yields the structural information of the image. Super-pixel segmentation is now used as a key step in a growing number of applications, such as image segmentation, object detection, image classification, and object recognition.
Traditional image semantic segmentation algorithms mostly rely on the low-level cues carried by the image pixels themselves and segment the image directly according to some strategy. Because they operate directly on pixels and require no parameter tuning, traditional algorithms are relatively simple, but obtaining an acceptable segmentation result depends on manually supplied information. At present, some traditional algorithms are often used as image preprocessing or post-processing steps in combination with neural networks. As deep learning has swept the world, computer vision researchers have seized the opportunity and achieved repeated successes in image semantic segmentation; the milestone fully convolutional network (Fully Convolutional Networks, FCN) model was born in this environment. Although existing image semantic segmentation techniques have reached a fairly good level of overall segmentation accuracy, existing methods usually cannot localize individual targets precisely. When objects overlap or occlude one another in complex ways, existing semantic segmentation algorithms handle them poorly: typically several objects stick together, edges are not clearly delineated, and objects may be mislabeled as other classes.
Summary of the invention
In order to solve the above problems in the prior art, the present invention provides a semantic segmentation optimization method and device for image edges. The technical problem to be solved by the present invention is addressed by the following technical solutions:
An embodiment of the present invention provides a semantic segmentation optimization method for image edges, comprising:
selecting image data;
training and validating an image semantic segmentation model and a fully connected conditional random field model using the image data;
obtaining a semantic segmentation result of an image using the trained image semantic segmentation model;
obtaining a super-pixel segmentation result containing image edge information using a super-pixel segmentation algorithm;
optimizing the semantic segmentation result using the super-pixel segmentation result to form a first optimization result;
optimizing the first optimization result using the trained fully connected conditional random field model.
In one embodiment of the present invention, selecting image data comprises:
selecting one of the VOC dataset and the Cityscapes dataset as the image data.
In one embodiment of the present invention, training and validating the image semantic segmentation model and the fully connected conditional random field model using the image data comprises:
dividing the image data into a training set and a validation set;
taking the training set as input, training the image semantic segmentation model and the fully connected conditional random field model by iterative supervised training;
taking the validation set as input, validating the trained image semantic segmentation model and the trained fully connected conditional random field model.
In one embodiment of the present invention, the image semantic segmentation model is an FCN-8s model.
In one embodiment of the present invention, the super-pixel segmentation algorithm is the SLIC super-pixel segmentation algorithm.
In one embodiment of the present invention, optimizing the semantic segmentation result using the super-pixel segmentation result to form the first optimization result comprises:
assigning semantic labels to the super-pixel segmentation result to form feature labels;
optimizing the feature labels using a local edge optimization algorithm to form the first optimization result.
Another embodiment of the present invention provides a semantic segmentation optimization device for image edges, comprising:
a data selection module, for selecting image data;
a training and validation module, for training and validating an image semantic segmentation model and a fully connected conditional random field model using the image data;
a semantic segmentation module, for obtaining a semantic segmentation result of an image using the trained image semantic segmentation model;
a super-pixel segmentation module, for obtaining a super-pixel segmentation result containing image edge information using a super-pixel segmentation algorithm;
a first optimization module, for optimizing the semantic segmentation result using the super-pixel segmentation result to form a first optimization result;
a second optimization module, for optimizing the first optimization result using the trained fully connected conditional random field model.
Compared with the prior art, the beneficial effects of the present invention are:
1. The present invention is a semantic segmentation optimization method for image edges that aims to optimize the segmentation results of existing algorithms. In a specific implementation, different super-pixel segmentation algorithms can be used as needed, so the method is flexible and highly compatible. In terms of performance, it not only extracts high-level semantic information from the image efficiently but also uses low-level image information to segment image edges accurately, while remaining robust to propagation errors.
2. The present invention uses super-pixels to preserve object edges in the image and improves the accuracy of existing segmentation models at image edges through a local edge optimization algorithm.
3. The present invention uses a fully connected conditional random field to constrain pixels that are similar in color and spatial position, making full use of the relationships between image pixels, thereby further optimizing the semantic segmentation result so that image edges are segmented more accurately.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a semantic segmentation optimization method for image edges provided by an embodiment of the present invention;
Fig. 2 is an implementation flow block diagram of the semantic segmentation optimization method for image edges provided by an embodiment of the present invention;
Fig. 3 is an implementation flow block diagram of the local edge optimization algorithm in the semantic segmentation optimization method for image edges provided by an embodiment of the present invention;
Fig. 4 shows the super-pixel-based edge optimization result, with a partial enlargement, in the semantic segmentation optimization method for image edges provided by an embodiment of the present invention;
Fig. 5 shows the result of precise edge recovery using the fully connected conditional random field, with a partial enlargement, in the semantic segmentation optimization method for image edges provided by an embodiment of the present invention;
Fig. 6 is a histogram comparing the segmentation accuracy on the VOC dataset of the semantic segmentation optimization method for image edges provided by an embodiment of the present invention with that of existing image semantic segmentation methods;
Fig. 7 is a histogram comparing the segmentation accuracy on the Cityscapes dataset of the semantic segmentation optimization method for image edges provided by an embodiment of the present invention with that of existing image semantic segmentation methods.
Detailed description of the embodiments
The present invention is described in further detail below with reference to specific embodiments, but embodiments of the present invention are not limited thereto.
Embodiment one
Refer to Fig. 1 and Fig. 2. Fig. 1 is a schematic flowchart of a semantic segmentation optimization method for image edges provided by an embodiment of the present invention, and Fig. 2 is an implementation flow block diagram of the same method.
An embodiment of the present invention provides a semantic segmentation optimization method for image edges, comprising:
selecting image data;
training and validating an image semantic segmentation model and a fully connected conditional random field model using the image data;
obtaining a semantic segmentation result of an image using the trained image semantic segmentation model;
obtaining a super-pixel segmentation result containing image edge information using a super-pixel segmentation algorithm;
optimizing the semantic segmentation result using the super-pixel segmentation result to form a first optimization result;
optimizing the first optimization result using the trained fully connected conditional random field model.
Specifically, this embodiment uses as image data two datasets that are widely used in their respective fields, with the standard split of each dataset. The first is the VOC dataset, which has essentially become the benchmark dataset for comprehensively evaluating new semantic segmentation algorithms. It contains 21 semantic classes: 20 foreground classes and 1 background class. The standard VOC dataset contains 1464 pictures in the training set, 1449 in the validation set, and 1456 in the test set, used for training, validation, and testing respectively; the training and validation sets both include pixel-level ground-truth semantic labels. The other is the Cityscapes dataset, an urban-scene dataset that focuses on semantic understanding of city street scenes and is divided into 19 semantic classes. The standard Cityscapes dataset has 2975 pictures in the training set, 500 in the validation set, and 1525 in the test set.
After the image data is selected, the image semantic segmentation model and the fully connected conditional random field model need to be trained and validated.
In this embodiment the image semantic segmentation model is an FCN model, which extracts coarse features from the image. Unlike a traditional convolutional neural network, an FCN can take an input image of arbitrary size and produce an output of corresponding size, yielding a pixel-level classification result. An FCN can be obtained by converting an existing convolutional neural network; the FCN used here is converted from VGG-16 of the VGGNet family.
The VGG-16 network is converted to an FCN by replacing the fully connected layers of the original network with convolutional layers while retaining the first five layer groups. Throughout feature extraction, repeated convolution and pooling operations make the resolution of the feature maps lower and lower. To restore the final output to the same size as the input image, the intermediate outputs must be upsampled. In the implementation, relative to the original input image, the resolution of the feature maps is reduced by factors of 2, 4, 8, 16, and 32 in turn. Directly upsampling the coarse features output by the last layer by a factor of 32 gives the FCN-32s result, but because the magnification is so large, the FCN-32s output lacks much detail and is not accurate enough. To improve accuracy, more detailed information from the preceding, higher-resolution layers is added to FCN-32s; combining this detail with the FCN-32s output yields the FCN-16s and FCN-8s results in turn.
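The fusion described above can be sketched on tiny score maps. This is a minimal illustration, not the embodiment's network: nearest-neighbour upsampling stands in for the learned upsampling layers of a real FCN, and the function and variable names are hypothetical.

```python
# Sketch of FCN-32s / FCN-16s / FCN-8s style fusion on toy score maps.
# Assumption: nearest-neighbour upsampling replaces the learned upsampling
# layers of a real FCN; all names here are illustrative only.

def upsample(grid, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def add(a, b):
    """Element-wise sum of two equally sized 2-D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def fcn_outputs(score_s8, score_s16, score_s32):
    """Return full-resolution maps for the 32s, 16s and 8s variants."""
    out_32s = upsample(score_s32, 32)              # one 32x jump, little detail
    fused_16 = add(upsample(score_s32, 2), score_s16)
    out_16s = upsample(fused_16, 16)
    fused_8 = add(upsample(fused_16, 2), score_s8)
    out_8s = upsample(fused_8, 8)                  # most detail retained
    return out_32s, out_16s, out_8s

# A 32 x 32 input yields 4 x 4, 2 x 2 and 1 x 1 maps at strides 8, 16 and 32.
s8 = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
s16 = [[1.0, 2.0], [3.0, 4.0]]
s32 = [[10.0]]
o32, o16, o8 = fcn_outputs(s8, s16, s32)
```

All three outputs have the input's size, but the number of distinct values grows from the 32s map to the 8s map, which mirrors why FCN-8s keeps more detail.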
In this embodiment the image data is divided into training-set images and validation-set images. The FCN-8s model is trained with supervision using the training-set images and their ground-truth semantic labels. To make the image features learned by the network more high-level and abstract, this embodiment performs 50 rounds of iterative supervised training, in which the model produced by one round serves as the initial model for the next. The hyperparameters of the fully connected conditional random field are determined by cross-validation. The two values ω2 and σγ are set first; these two parameters have little effect on classification accuracy and mainly affect smoothness. They are initialized to ω2 = 1 and σγ = 1 and, based on test results, are finally set to ω2 = 3 and σγ = 3 in this embodiment. For the three hyperparameters ω1, σα, and σβ, this embodiment uses a coarse-to-fine search for the optimal values. A small number of pictures from the training dataset are used for the search, and the three parameters are initialized to ω1 = 3, σα = 30, and σβ = 3. The initial search ranges are ω1 ∈ [3:6], σα ∈ [30:10:100], and σβ ∈ [3:6], meaning that ω1 and σβ are searched from 3 to 6 in steps of 1 and σα is searched from 30 to 100 in steps of 10. After each round, the search is repeated in the range around the best value found, with the step halved, until the search finally stops; this ensures that the conditional random field parameters set in this embodiment are the optimized ones. The values found by the search and used in this embodiment are ω1 = 5, σα = 49, and σβ = 3.
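The coarse-to-fine search can be sketched as a grid search that re-centres on the best point and halves each step. This is only an outline under stated assumptions: `evaluate` is a hypothetical callable that would run the conditional random field on held-out pictures and return an accuracy score, the number of rounds is arbitrary, and the toy objective below is invented purely so that the example runs.

```python
import itertools

def coarse_to_fine_search(evaluate, ranges, rounds=4):
    """Grid search that halves each step and re-centres on the best point.

    evaluate: callable(params_tuple) -> score, higher is better (hypothetical).
    ranges:   list of (low, high, step), one entry per parameter.
    """
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        grids = []
        for low, high, step in ranges:
            count = int(round((high - low) / step)) + 1
            grids.append([low + i * step for i in range(count)])
        for params in itertools.product(*grids):
            score = evaluate(params)
            if score > best_score:
                best, best_score = params, score
        # Keep one old step either side of the best value, then halve the step.
        ranges = [(max(low, b - step), min(high, b + step), step / 2)
                  for (low, high, step), b in zip(ranges, best)]
    return best, best_score

# Toy objective (made up): peaks at w1 = 5, sigma_alpha = 49, sigma_beta = 3.
def toy_score(params):
    w1, sigma_alpha, sigma_beta = params
    return -((w1 - 5) ** 2 + ((sigma_alpha - 49) / 10) ** 2 + (sigma_beta - 3) ** 2)

initial = [(3, 6, 1), (30, 100, 10), (3, 6, 1)]   # ranges quoted in the text
best_params, best_value = coarse_to_fine_search(toy_score, initial)
```

Because the step is halved each round, the search approaches the peak of the objective without ever evaluating the full fine grid.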
After the image semantic segmentation model and the fully connected conditional random field model have been trained, the image semantic segmentation result is obtained from the trained image semantic segmentation model.
Specifically, an image is input into the trained FCN-8s model to obtain its semantic segmentation result. Because receptive fields differ between layers, after the first few convolutions the resolution is relatively high: pixel classification is less accurate, but the localization of each pixel is more accurate. In the last few convolutions the resolution is relatively low: pixel localization is less accurate, but pixel classification is more accurate. The FCN-8s model has a small receptive field and is well suited to capturing detail, so the FCN-8s result is the closest to the ground-truth semantic labels. Even so, the FCN-8s result is not sensitive enough to image detail and is therefore referred to as the coarse result.
After the coarse semantic segmentation result is obtained, a super-pixel segmentation algorithm is used to obtain a super-pixel segmentation result that contains image edge information.
To obtain super-pixels that adhere well to edges, this embodiment applies the SLIC super-pixel segmentation algorithm to the image to be segmented. Because image resolutions vary, the parameters are adjusted dynamically according to the resolution during super-pixel segmentation; the adjustment strategy is to keep the number of pixels in each super-pixel within the range [200, 500]. For example, for a 500 × 500 image the number of super-pixels is set to 1000, and for a 1024 × 2048 image it is set to 6000. The specific steps are as follows:
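The resolution-dependent choice of super-pixel count can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical target of 350 pixels per super-pixel; the text only states the [200, 500] range and the two examples, so the target value and function names are illustrative.

```python
def pixels_per_superpixel(height, width, n_superpixels):
    """Average number of pixels that each super-pixel would contain."""
    return height * width / n_superpixels

def choose_superpixel_count(height, width, target=350, low=200, high=500):
    """Pick a count so each super-pixel holds roughly `target` pixels.

    `target` is an assumed midpoint; very small images can still fall
    below `low` because at least one super-pixel is always returned.
    """
    if not low <= target <= high:
        raise ValueError("target must lie within [low, high]")
    return max(1, round(height * width / target))

# The two settings quoted in the text both fall inside [200, 500].
voc_like = pixels_per_superpixel(500, 500, 1000)            # 250.0
cityscapes_like = pixels_per_superpixel(1024, 2048, 6000)   # about 349.5
```

Both quoted settings sit inside the stated range, and a count chosen from a mid-range target does as well for images of ordinary size.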
SLIC is essentially a local K-means clustering algorithm. Suppose the image contains Np pixels in total and the desired number of super-pixels is Ns. In the image plane, with the pixel as the basic unit, Ns initial cluster centers are chosen uniformly with step S in the horizontal and vertical directions, starting from row Row, where Row equals S/2. The step length is computed as:
$$S = \sqrt{N_p / N_s}$$
To speed up super-pixel generation, SLIC assigns each pixel to its nearest cluster center within a local rectangular window of size 2S × 2S. The gray value g and the position (x, y) form the feature vector of a pixel. Suppose the feature vector of any pixel pi is fi = [gi, xi, yi] and that of any cluster center pc is fc = [gc, xc, yc]. The distance Ds between pixel pi and cluster center pc is then computed from the gray distance dg and the spatial distance dxy, where S is the step length and m is the parameter that controls the compactness and regularity of the super-pixels. The larger m is, the more regular the generated super-pixels; the usual value range is k' = {500, 1000, 1500, 2000, 2500}. The gray distance dg and the spatial distance dxy are computed respectively as:
$$d_g = \sqrt{(g_i - g_c)^2}$$
$$d_{xy} = \sqrt{(x_i - x_c)^2 + (y_i - y_c)^2}$$
where gi and gc are the gray values of pixel pi and cluster center pc, xi and xc are their coordinates along the X axis, and yi and yc are their coordinates along the Y axis.
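The text names the combined distance Ds but does not write it out. As a hedged reconstruction, assuming the method follows the SLIC literature with the gray distance dg taking the place of the color distance, Ds could take either of the following forms; the source does not state which one the embodiment uses.

```latex
% Additive form (earlier SLIC formulation), gray distance replacing colour distance
D_s = d_g + \frac{m}{S}\, d_{xy}

% Quadratic form (later SLIC formulation)
D_s = \sqrt{d_g^{2} + \left(\frac{d_{xy}}{S}\right)^{2} m^{2}}
```

In both forms a larger m gives the spatial term more weight, which is consistent with the statement that larger m produces more regular super-pixels.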
The SLIC super-pixel segmentation algorithm proceeds as follows:
1. In the image plane, with the pixel as the basic unit and S as the step in the vertical and horizontal directions, choose Ns cluster centers uniformly, starting from row Row.
2. To avoid placing a cluster center on an edge pixel or a noise pixel of the image, compute the gradient of every pixel in the Ns × Ns neighborhood of each cluster center and take the pixel with the smallest gradient as the new cluster center.
3. Set an iteration variable θ and initialize it to 0. Within a 2S × 2S search window, assign each pixel to the cluster center at the smallest distance from it, obtaining R clusters of similar pixels.
4. Compute the mean feature vector of all pixels in each cluster and use it to update that cluster's center.
5. Check whether θ exceeds the iteration threshold Ω. If so, the algorithm terminates and yields Ns super-pixels (each cluster of similar pixels is one super-pixel); otherwise increment θ by 1 and return to step 3.
Empirical data show that 10 iterations are enough for the cluster-center error between two consecutive iterations to stay within 5%, so the number of iterations is generally set to 10.
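The steps above can be sketched in plain Python for a gray-scale image stored as a list of lists. This is an illustrative outline rather than the embodiment's implementation: it omits the gradient-based adjustment of step 2, assumes the additive distance dg + (m / S) · dxy, and uses a hypothetical default of m = 10.

```python
import math

def slic_gray(image, n_superpixels, m=10.0, iterations=10):
    """Return a label map assigning each pixel to a super-pixel index."""
    h, w = len(image), len(image[0])
    step = max(1, int(math.sqrt(h * w / n_superpixels)))
    # Step 1: centres on a regular grid, starting half a step in.
    centers = [[float(image[y][x]), float(x), float(y)]
               for y in range(step // 2, h, step)
               for x in range(step // 2, w, step)]
    labels = [[-1] * w for _ in range(h)]
    for _ in range(iterations):
        best = [[math.inf] * w for _ in range(h)]
        # Step 3: each centre claims pixels inside its local search window.
        for k, (g, cx, cy) in enumerate(centers):
            x0, x1 = max(0, int(cx) - step), min(w, int(cx) + step + 1)
            y0, y1 = max(0, int(cy) - step), min(h, int(cy) + step + 1)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    d_g = abs(image[y][x] - g)
                    d_xy = math.hypot(x - cx, y - cy)
                    dist = d_g + (m / step) * d_xy   # assumed form of Ds
                    if dist < best[y][x]:
                        best[y][x] = dist
                        labels[y][x] = k
        # Step 4: move each centre to the mean feature vector of its cluster.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centers]
        for y in range(h):
            for x in range(w):
                acc = sums[labels[y][x]]
                acc[0] += image[y][x]
                acc[1] += x
                acc[2] += y
                acc[3] += 1
        for k, (sg, sx, sy, n) in enumerate(sums):
            if n:
                centers[k] = [sg / n, sx / n, sy / n]
    return labels

# Toy image: dark left half, bright right half.
toy = [[0 if x < 10 else 255 for x in range(20)] for _ in range(20)]
toy_labels = slic_gray(toy, 4)
```

On the toy image no super-pixel straddles the vertical boundary, which illustrates how the super-pixel edges follow the image edge.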
Because the edges of the generated super-pixels fit the edges in the image, the super-pixels describe the image's edge information well.
After the super-pixel segmentation result containing image edge information is obtained, it is used to optimize the coarse image semantic segmentation result.
The core idea of the edge optimization algorithm proposed by the present invention is to use the pixel-level feature map output by the FCN to assign semantic labels to all pixels in each super-pixel. Several cases can arise in this process; the pseudocode is given below. Super-pixels are first divided by whether they contain an image edge, and then by whether all of their pixels carry the same semantic label. For convenience: a super-pixel that contains no image edge and whose pixels all have the same semantic label is case A; one that contains no image edge but whose pixels have several semantic labels is case B; one that contains an image edge but whose pixels still all have the same semantic label is case C; and one that contains an image edge and whose pixels have several semantic labels is case D. These cases are analyzed in detail below.
As shown in Fig. 3, the basic steps of the local edge optimization algorithm are:
1. Let the input image be I and the coarse features be L.
2. Obtain K super-pixels R = {R1, R2, ..., RK} using the SLIC super-pixel segmentation algorithm, where Ri denotes the super-pixel with index i.
3. Outer loop: for i = 1:K
   A. Let M = {C1, C2, ..., CK} denote all pixels in Ri, where Cj denotes the pixels labeled as class j.
   B. Obtain the feature of each pixel in C from the front end and initialize the weights WC to 0.
   C. Inner loop: for j = 1:N
      Save the feature label of Cj, then update the weights of all labels in the super-pixel from their previous values.
      If the exit condition is met, leave the inner loop; otherwise continue.
      End.
   D. Search all WC to determine whether any weight is greater than 0.8. If so, go to the next step. Otherwise, find the largest weight Wmax and the second largest weight Wsub and determine whether the difference between them is greater than 0.2. If so, go to the next step; otherwise continue the outer loop.
   E. Relabel the current super-pixel with the most likely class in it.
   End.
4. Output the optimized result.
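Steps D and E above can be sketched as a voting rule over each super-pixel. The weight update itself is not spelled out in the text, so this sketch assumes that a class weight is simply the fraction of the super-pixel's pixels carrying that label; the function and parameter names are hypothetical.

```python
from collections import Counter

def refine_with_superpixels(labels, superpixels, major=0.8, margin=0.2):
    """Relabel each super-pixel by its dominant class when the vote is clear.

    labels:      2-D list of class ids from the coarse segmentation.
    superpixels: 2-D list of super-pixel ids with the same shape.
    Assumption: class weight = share of pixels in the super-pixel.
    """
    members = {}
    for y, row in enumerate(superpixels):
        for x, sp in enumerate(row):
            members.setdefault(sp, []).append((y, x))
    refined = [list(row) for row in labels]
    for pixels in members.values():
        counts = Counter(labels[y][x] for y, x in pixels)
        ranked = counts.most_common(2)
        w_max = ranked[0][1] / len(pixels)
        w_sub = ranked[1][1] / len(pixels) if len(ranked) > 1 else 0.0
        # Step D: accept if one class exceeds `major` or leads by `margin`.
        if w_max > major or (w_max - w_sub) > margin:
            for y, x in pixels:          # Step E: relabel the super-pixel
                refined[y][x] = ranked[0][0]
    return refined

coarse = [[1, 1, 1, 1, 1, 2],
          [3, 3, 3, 4, 4, 4]]
regions = [[0, 0, 0, 0, 0, 0],
           [1, 1, 1, 1, 1, 1]]
refined = refine_with_superpixels(coarse, regions)
```

In the example, the first super-pixel has a clear majority and is relabeled as a whole, while the second is evenly split and is left unchanged.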
For situation A, such super-pixels usually lie in the background or in the main body of an object, or possibly along a clear, smooth edge. Because the FCN model has assigned the same semantic label to every pixel inside the super-pixel, no optimization is needed and the original semantic labels are kept. Situation B is similar to situation A in that the super-pixel also contains no image edge, so it is likely to lie in the background, in the main region of an object, or near a smooth edge. The up-sampling operation of the FCN model can introduce propagation errors, for example by labeling background near an image edge as another class, and this is the main cause of situation B. In such super-pixels the misclassified pixels are a minority, so the semantic label that accounts for the largest share of the super-pixel is reassigned to all of its pixels.
Situations C and D both involve super-pixels that contain an image edge, but they differ in the semantic labels of their pixels. In situation C all pixels share the same semantic label, which indicates that the FCN model has identified the region as the main body of an object that itself contains edges. For such super-pixels the existing semantic labels are again kept, because the high-level semantic features abstracted by the FCN model are fairly robust to brightness changes, small deformations, occlusion and similar effects, and the internal edge may simply be a shadow produced by the lighting. Situation D is the most complex: the super-pixel contains an image edge and its pixels have been given different semantic labels. This tends to occur at small structures in the image or in local regions affected by occlusion or covering. For this situation the present embodiment uses an adaptive strategy. If a single semantic label has been assigned to 80% or more of the pixels, that label is used as the semantic label of the whole super-pixel. If no semantic label accounts for the large majority of the pixels, the region is left unoptimized, because in that situation the super-pixel cannot effectively separate the image edge and applying the optimization could be counterproductive.
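The handling of the four situations can be summarized in a short Python sketch. It is illustrative only: the function names are hypothetical, and `contains_edge` is an assumed boolean flag, since the text does not say how the presence of an image edge inside a super-pixel is detected.

```python
from collections import Counter


def classify_superpixel(pixel_labels, contains_edge):
    """Return 'A', 'B', 'C' or 'D' for one super-pixel."""
    multiple_labels = len(set(pixel_labels)) > 1
    if not contains_edge:
        return "B" if multiple_labels else "A"
    return "D" if multiple_labels else "C"


def resolve_superpixel(pixel_labels, contains_edge, threshold=0.8):
    """Apply the per-situation labeling rule and return the new pixel labels."""
    situation = classify_superpixel(pixel_labels, contains_edge)
    if situation in ("A", "C"):
        # Uniform labels: keep what the segmentation model produced.
        return list(pixel_labels)
    label, count = Counter(pixel_labels).most_common(1)[0]
    if situation == "B":
        # No edge inside: the minority labels are treated as errors.
        return [label] * len(pixel_labels)
    # Situation D: overwrite only if one label covers at least 80% of the pixels.
    if count / len(pixel_labels) >= threshold:
        return [label] * len(pixel_labels)
    return list(pixel_labels)
```

For example, a super-pixel with an edge and a 9:1 label split is unified under the majority label, whereas a 2:2 split with an edge is returned unchanged.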
After the coarse image semantic segmentation result has been optimized with the super-pixel segmentation result, an optimized image semantic segmentation result is obtained. The fully connected conditional random field model is then used to optimize this result further.
Specifically, after local edge optimization, the semantic segmentation accuracy at weak edges, at small structures and in complex scenes still needs to be improved. The present invention therefore uses the fully connected conditional random field model to recover image edges more accurately, that is, to further optimize the segmentation of edge regions and thereby raise the overall image semantic segmentation accuracy.
According to the basic theory of conditional random fields, the per-pixel labels are treated as random variables and the relationships between pixels as edges, and together they form a conditional random field. These labels can be modeled once a global observation is available, and the global observation is usually easy to obtain: it is typically the input image. More specifically, in the fully connected conditional random field the input image with $N$ pixels serves as the global observation $I$. Given a graph $G = (V, E)$, $V$ and $E$ denote the vertices and edges of the graph respectively. Let $X$ be the vector formed by the random variables $\{X_1, X_2, \ldots, X_N\}$, where $X_i$ is the random variable representing the label assigned to pixel $i$. From the input image $I$ and the edge-optimized pixel-level semantic segmentation map, the fully connected conditional random field model is established and expressed by the probability distribution $P(X)$:
$$P(X = x \mid I) = \frac{1}{Z(I)} \exp\left(-E(x \mid I)\right)$$
where $E(x)$ is the Gibbs energy of the labeling $x \in L^N$ and $Z(I)$ is the partition function. The fully connected conditional random field uses the energy function:
$$E(x) = \sum_{i} \psi_u(x_i) + \sum_{i<j} \psi_p(x_i, x_j)$$
where $\psi_u(x_i)$ is the unary potential, which reflects the probability that pixel $i$ is assigned label $x_i$; in the present embodiment the unary potential $\psi_u(x_i)$ is taken from the optimized semantic segmentation result. $\psi_p(x_i, x_j)$ is the pairwise potential, which reflects the probability that pixels $i$ and $j$ are simultaneously labeled $x_i$ and $x_j$, as shown below:
$$\psi_p(x_i, x_j) = \mu(x_i, x_j)\left[\omega_1 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2} - \frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2}\right) + \omega_2 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2}\right)\right]$$
where $I_i$ and $I_j$ are color vectors and $p_i$ and $p_j$ are pixel positions; the hyperparameters $\sigma_\alpha$, $\sigma_\beta$ and $\sigma_\gamma$ control the range of the Gaussian kernels; and $\mu(x_i, x_j)$ is the label compatibility function, with $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $\mu(x_i, x_j) = 0$ otherwise. This means that nearby, similar pixels that are assigned different labels are penalized. In other words, similar pixels are encouraged to take the same label, while pixels that differ greatly in "distance" tend to be assigned different labels. For example, the probability that the two objects "road" and "vehicle" appear in the same picture should be much larger than the probability that "meadow" and "vehicle" appear together. The definition of "distance" depends on pixel color and on the actual distance between pixels, so the fully connected conditional random field can split the image along edges as far as possible.
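To make the energy terms concrete, the following Python sketch evaluates the pairwise potential and the Gibbs energy of a tiny labeling by brute force. It is an illustration under stated assumptions rather than the inference procedure of the embodiment: the function names are hypothetical, the unary term is assumed to be supplied as a cost table `unary[i][label]` (for example a negative log-probability), and the default hyperparameters are illustrative values, with `sigma_gamma = 3` in particular being an assumption.

```python
import math


def pairwise_potential(xi, xj, pi, pj, Ii, Ij,
                       w1=5.0, w2=3.0,
                       sigma_alpha=49.0, sigma_beta=3.0, sigma_gamma=3.0):
    """Pairwise cost of giving labels xi, xj to two pixels.

    pi, pj are pixel positions and Ii, Ij are color vectors (tuples of numbers).
    """
    if xi == xj:
        return 0.0  # compatibility function: no penalty for equal labels
    d_pos = sum((a - b) ** 2 for a, b in zip(pi, pj))
    d_col = sum((a - b) ** 2 for a, b in zip(Ii, Ij))
    # Appearance kernel: nearby pixels with similar color.
    appearance = w1 * math.exp(-d_pos / (2 * sigma_alpha ** 2)
                               - d_col / (2 * sigma_beta ** 2))
    # Smoothness kernel: nearby pixels regardless of color.
    smoothness = w2 * math.exp(-d_pos / (2 * sigma_gamma ** 2))
    return appearance + smoothness


def gibbs_energy(labels, positions, colors, unary):
    """Sum of unary costs plus pairwise costs over all pixel pairs i < j."""
    n = len(labels)
    energy = sum(unary[i][labels[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pairwise_potential(labels[i], labels[j],
                                         positions[i], positions[j],
                                         colors[i], colors[j])
    return energy
```

The double loop over all pixel pairs is quadratic in the number of pixels, so it is only suitable for toy inputs; practical fully connected CRF implementations rely on approximate inference instead.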
This example determines the hyperparameters of the fully connected conditional random field by cross-validation. It first sets the two values $\omega_2$ and $\sigma_\gamma$. These two parameters have little influence on classification accuracy and mainly affect smoothness. Their initial values are set to $\omega_2 = 1$ and $\sigma_\gamma = 1$, and, based on the test results, the final values are $\omega_2 = 3$ and $\sigma_\gamma = 3$. For the three hyperparameters $\omega_1$, $\sigma_\alpha$ and $\sigma_\beta$, this example uses a coarse-to-fine search for the optimal values. A small number of images from the training data set is selected for the search, and the initial values of the three parameters are set to $\omega_1 = 3$, $\sigma_\alpha = 30$ and $\sigma_\beta = 3$. The initial search ranges are $\omega_1 \in [3:6]$, $\sigma_\alpha \in [30:10:100]$ and $\sigma_\beta \in [3:6]$, meaning that $\omega_1$ and $\sigma_\beta$ are searched from 3 to 6 in steps of 1, and $\sigma_\alpha$ is searched from 30 to 100 in steps of 10. After each round, the search is repeated within the range around the best value with the step halved, until the stopping condition is met; this ensures that the random field parameters set in this example are the optimal parameters. The three values found by the search and used in this example are $\omega_1 = 5$, $\sigma_\alpha = 49$ and $\sigma_\beta = 3$.
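A coarse-to-fine search of this kind might look like the following Python sketch. It is a hedged illustration: `coarse_to_fine`, `frange` and the `score` callback (standing in for a validation metric computed on the selected images) are hypothetical names, and a fixed number of rounds is assumed as the stopping condition because the text does not specify one.

```python
from itertools import product


def frange(lo, hi, step):
    """Inclusive list of values lo, lo + step, ..., hi."""
    count = int(round((hi - lo) / step))
    return [lo + k * step for k in range(count + 1)]


def coarse_to_fine(score, ranges, rounds=3):
    """Grid-search `ranges` (name -> (lo, hi, step)), halving the step each round.

    `score` maps a parameter dict to a number; larger is better.
    """
    best_score, best_params = None, None
    for _ in range(rounds):
        names = list(ranges)
        grids = [frange(*ranges[name]) for name in names]
        for values in product(*grids):
            params = dict(zip(names, values))
            current = score(params)
            if best_score is None or current > best_score:
                best_score, best_params = current, params
        # Re-centre every range on the best value, one old step to each side
        # (clamped to the current bounds), and halve the step.
        ranges = {
            name: (max(lo, best_params[name] - step),
                   min(hi, best_params[name] + step),
                   step / 2)
            for name, (lo, hi, step) in ranges.items()
        }
    return best_params
```

A call such as `coarse_to_fine(score, {"omega1": (3, 6, 1), "sigma_alpha": (30, 100, 10), "sigma_beta": (3, 6, 1)})` mirrors the initial ranges given above; the exhaustive grid keeps the sketch simple but grows quickly with the number of parameters.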
The effect of the present invention is further described below in conjunction with simulation tests:
1. Simulation conditions and content
The hardware simulation platform of the present embodiment is: an Intel Core [email protected] CPU, 8.0 GB of memory, and an NVIDIA Titan Xp graphics card with 12 GB of video memory.
Simulation 1: An image to be segmented is selected and fed into the trained FCN-8s model, while the SLIC super-pixel segmentation algorithm is used to obtain the super-pixel segmentation result of the image. The local edge optimization model is then used to optimize the edges of the semantic segmentation result of the image; the optimization effect is shown in Fig. 4;
Simulation 2: The segmentation result obtained after the preliminary edge optimization in Simulation 1 is further optimized with the trained conditional random field model; the comparison is shown in Fig. 5;
Simulation 3: Based on per-class segmentation accuracy, the present invention is compared with two existing well-known semantic segmentation methods on test images from the VOC data set; the results are shown in Fig. 6;
Simulation 4: Based on per-class segmentation accuracy, the present invention is compared with two existing well-known semantic segmentation methods on test images from the Cityscapes data set; the results are shown in Fig. 7.
2. Analysis of simulation results
Referring to Fig. 4, many super-pixels with clear, smooth and prominent edges fit the boundaries of the objects well. Most of these regions belong to situation A, that is, for the great majority of pixels the semantic labels assigned by the FCN model are kept. Misclassified instances of situations B, C and D can be found in the enlarged local regions marked by boxes in Fig. 4. The most common error in situation B is that background pixels are wrongly assigned to other classes, and the enlarged local regions show that the optimization algorithm of the present invention corrects such errors effectively. Correcting the errors of situation C undoubtedly improves the accuracy of the semantic segmentation algorithm; the boxed regions of the figure contain many such super-pixels, and the optimization algorithm of the present invention corrects the wrongly labeled pixels one by one. Super-pixels in situation D carry semantic information of different classes, with roughly equal numbers of pixels in each class. They generally occur in complex settings such as weak edges or small structures and are easily misclassified. For such super-pixels the present invention does not reclassify hastily but stays consistent with the segmentation result provided by the front end. According to the results in Fig. 4, all four situations described above are covered, which shows that the edge optimization algorithm proposed by the present invention can reassign the semantic labels of the pixels within a super-pixel according to the semantic label allocation strategy.
Referring to Fig. 5, the following can be observed in the three local details enclosed by boxes:
(1) The semantic segmentation result obtained with super-pixel edge optimization alone is not yet perfect and leaves room for further improvement, whereas the result after the CRF constraint is much closer to the true semantic labels.
(2) In large-scale image structures, such as the part where the bottom of the train meets the rails, the super-pixels cannot locate the object edge accurately, so the local edge optimization algorithm keeps the labels produced by the FCN model. After the precise edge recovery of the CRF, the pixels mislabeled by the FCN model are corrected and reassigned to the correct semantic class.
(3) For some small structures, such as the handrails and railings on the top and front of the train, the correction capability of super-pixel edge optimization is limited, whereas the constraint of the conditional random field achieves good results and makes the optimized image edges fit the true object edges more closely.
Referring to Fig. 6, all three algorithms achieve their highest recognition rate on the background class, above 90%. Among all classes, the chair class has the lowest recognition accuracy, only 20% to 35%. The three models achieve relatively high segmentation accuracy on classes such as bird, bus, cat, motorcycle, person and train. The optimization algorithm proposed by the present invention achieves the best performance in most classes. Compared with FCN-8s in particular, the IoU score improves markedly in almost every class. The above qualitative analysis of the segmentation results of the present invention and of the existing semantic segmentation methods based on fully convolutional networks shows that the present invention inherits the FCN model's excellent ability to extract high-level semantic information from images. At the same time, because the model makes full use of positional information such as image edges, it localizes structures such as fine edges and fine cracks in the image more accurately.
Referring to Fig. 6 and Fig. 7, the experimental results on the Cityscapes data set are consistent with those on the VOC data set. With the optimization algorithm proposed by the present invention, the IoU score is markedly higher than that of FCN-8s. The quantitative comparisons on the two data sets in Fig. 6 and Fig. 7 show that the proposed super-pixel-based edge optimization, combined with edge recovery by the conditional random field, effectively improves semantic segmentation accuracy. This demonstrates that using the low-level image information obtained by traditional image segmentation algorithms to optimize coarse semantic segmentation results is an effective way to improve the segmentation accuracy of existing semantic segmentation algorithms.
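For reference, the per-class IoU (intersection over union) score used in such comparisons is conventionally computed as the number of pixels where prediction and ground truth agree on a class, divided by the number of pixels where either of them has that class. The sketch below shows this standard definition in Python; the function names and the flat-list input format are assumptions, and the source does not state exactly how its scores were computed.

```python
def per_class_iou(pred, truth, num_classes):
    """Return a list with the IoU of each class, or None if the class is absent.

    pred and truth are flat lists of integer class ids of equal length.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if p == t:
            intersection[p] += 1
            union[p] += 1
        else:
            # A disagreement enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    return [intersection[c] / union[c] if union[c] else None
            for c in range(num_classes)]


def mean_iou(ious):
    """Average the IoU over the classes that actually occur."""
    present = [value for value in ious if value is not None]
    return sum(present) / len(present)
```

Classes that appear in neither the prediction nor the ground truth are reported as `None` and excluded from the mean, which avoids dividing by zero.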
Another embodiment of the present invention provides a semantic segmentation optimization device for edge images, characterized by comprising:
A data selection module, for selecting image data;
A training and validation module, for using the image data to train and validate the image semantic segmentation model and the fully connected conditional random field model;
A semantic segmentation module, for obtaining the semantic segmentation result of an image using the trained image semantic segmentation model;
A super-pixel segmentation module, for obtaining a super-pixel segmentation result containing image edge information using a super-pixel segmentation algorithm;
A first optimization module, for optimizing the semantic segmentation result using the super-pixel segmentation result to form a first optimization result;
A second optimization module, for optimizing the first optimization result using the trained fully connected conditional random field model.
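The data flow between the four processing modules can be sketched as a simple composition of callables. This is only an outline under stated assumptions: the function name `optimize_segmentation` and its parameter names are hypothetical, and each callable stands in for a trained model or algorithm that the sketch does not implement.

```python
def optimize_segmentation(image, segment, superpixelize, edge_refine, crf_refine):
    """Chain the four processing modules in the order described above."""
    coarse = segment(image)               # semantic segmentation module
    regions = superpixelize(image)        # super-pixel segmentation module
    first = edge_refine(coarse, regions)  # first optimization module
    return crf_refine(image, first)       # second optimization module


# Usage with trivial stand-in callables (placeholders, not real models).
if __name__ == "__main__":
    result = optimize_segmentation(
        image=[10, 12, 200, 205],
        segment=lambda img: [0 if value < 100 else 1 for value in img],
        superpixelize=lambda img: [0, 0, 1, 1],
        edge_refine=lambda labels, regions: labels,
        crf_refine=lambda img, labels: labels,
    )
    print(result)
```

Passing the stages in as arguments reflects the point made in the text that different super-pixel segmentation algorithms can be substituted without changing the rest of the pipeline.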
The method proposed by the present invention aims to optimize the segmentation results of existing algorithms. In a specific implementation, different super-pixel segmentation algorithms can be used as required, so the method is flexible to implement and highly compatible. In terms of performance, it not only extracts high-level semantic information from the image efficiently, but also uses low-level image information to segment image edges accurately, while being robust to propagation errors. Super-pixels are used to preserve the object edges in the image; the local edge optimization algorithm improves the semantic segmentation accuracy of existing segmentation models at image edges; and the fully connected conditional random field constrains pixels that are similar in color and spatial position, making full use of the relationships between the pixels of the image, so that the semantic segmentation result is further optimized and image edges are segmented more accurately.
The above is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the present invention should not be regarded as limited to these descriptions. For a person of ordinary skill in the art to which the present invention belongs, a number of simple deductions or substitutions may be made without departing from the concept of the present invention, and all of them shall be regarded as falling within the protection scope of the present invention.
Claims (7)
1. A semantic segmentation optimization method for edge images, characterized by comprising:
Selecting image data;
Training and validating an image semantic segmentation model and a fully connected conditional random field model using the image data;
Obtaining a semantic segmentation result of an image using the trained image semantic segmentation model;
Obtaining a super-pixel segmentation result containing image edge information using a super-pixel segmentation algorithm;
Optimizing the semantic segmentation result using the super-pixel segmentation result to form a first optimization result;
Optimizing the first optimization result using the trained fully connected conditional random field model.
2. The method according to claim 1, characterized in that the image data comprises:
The VOC data set or the Cityscapes data set.
3. The method according to claim 1, characterized in that training and validating the image semantic segmentation model and the fully connected conditional random field model using the image data comprises:
Dividing the image data into a training set and a validation set;
Using the training set as input, training the image semantic segmentation model and the fully connected conditional random field model by iterative supervised training;
Using the validation set as input, validating the trained image semantic segmentation model and the trained fully connected conditional random field model.
4. The method according to claim 1, characterized in that the image semantic segmentation model is the FCN-8s model.
5. The method according to claim 1, characterized in that the super-pixel segmentation algorithm is the SLIC super-pixel segmentation algorithm.
6. The method according to claim 1, characterized in that optimizing the semantic segmentation result using the super-pixel segmentation result to form the first optimization result comprises:
Assigning semantic labels to the super-pixel segmentation result to form feature labels;
Optimizing the feature labels using a local edge optimization algorithm to form the first optimization result.
7. A semantic segmentation optimization device for edge images, characterized by comprising:
A data selection module, for selecting image data;
A training and validation module, for using the image data to train and validate an image semantic segmentation model and a fully connected conditional random field model;
A semantic segmentation module, for obtaining a semantic segmentation result of an image using the trained image semantic segmentation model;
A super-pixel segmentation module, for obtaining a super-pixel segmentation result containing image edge information using a super-pixel segmentation algorithm;
A first optimization module, for optimizing the semantic segmentation result using the super-pixel segmentation result to form a first optimization result;
A second optimization module, for optimizing the first optimization result using the trained fully connected conditional random field model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910059828.7A CN109919159A (en) | 2019-01-22 | 2019-01-22 | A kind of semantic segmentation optimization method and device for edge image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109919159A true CN109919159A (en) | 2019-06-21 |
Family
ID=66960467
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910059828.7A Pending CN109919159A (en) | 2019-01-22 | 2019-01-22 | A kind of semantic segmentation optimization method and device for edge image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109919159A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104637045A (en) * | 2013-11-14 | 2015-05-20 | 重庆理工大学 | Image pixel labeling method based on super pixel level features |
WO2016016033A1 (en) * | 2014-07-31 | 2016-02-04 | Thomson Licensing | Method and apparatus for interactive video segmentation |
CN108734711A (en) * | 2017-04-21 | 2018-11-02 | 德尔福技术有限责任公司 | The method that semantic segmentation is carried out to image |
CN108764027A (en) * | 2018-04-13 | 2018-11-06 | 上海大学 | A kind of sea-surface target detection method calculated based on improved RBD conspicuousnesses |
Non-Patent Citations (2)
Title |
---|
RADHAKRISHNA ACHANTA ET AL: ""SLIC Superpixels"", 《EPFL TECHNICAL REPORT》 * |
WEI ZHAO ET AL: ""An Improved Image Semantic Segmentation Method Based on Superpixels and Conditional Random Fields"", 《APPLIED SCIENCE》 * |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796181A (en) * | 2019-06-28 | 2020-02-14 | 北京建筑大学 | Cultural relic disease high-precision automatic extraction method based on texture |
CN110796204A (en) * | 2019-11-01 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Video tag determination method and device and server |
CN110796204B (en) * | 2019-11-01 | 2023-05-02 | 腾讯科技(深圳)有限公司 | Video tag determining method, device and server |
CN110956221A (en) * | 2019-12-17 | 2020-04-03 | 北京化工大学 | Small sample polarization synthetic aperture radar image classification method based on deep recursive network |
CN111192279B (en) * | 2020-01-02 | 2022-09-02 | 上海交通大学 | Object segmentation method based on edge detection, electronic terminal and storage medium |
CN111192279A (en) * | 2020-01-02 | 2020-05-22 | 上海交通大学 | Object segmentation method based on edge detection, electronic terminal and storage medium |
CN111489373A (en) * | 2020-04-07 | 2020-08-04 | 北京工业大学 | Occlusion object segmentation method based on deep learning |
CN111489373B (en) * | 2020-04-07 | 2023-05-05 | 北京工业大学 | Occlusion object segmentation method based on deep learning |
CN111612802A (en) * | 2020-04-29 | 2020-09-01 | 杭州电子科技大学 | Re-optimization training method based on existing image semantic segmentation model and application |
CN111931782B (en) * | 2020-08-12 | 2024-03-01 | 中国科学院上海微***与信息技术研究所 | Semantic segmentation method, system, medium and device |
CN111931782A (en) * | 2020-08-12 | 2020-11-13 | 中国科学院上海微***与信息技术研究所 | Semantic segmentation method, system, medium, and apparatus |
CN112084923A (en) * | 2020-09-01 | 2020-12-15 | 西安电子科技大学 | Semantic segmentation method for remote sensing image, storage medium and computing device |
CN112084923B (en) * | 2020-09-01 | 2023-12-22 | 西安电子科技大学 | Remote sensing image semantic segmentation method, storage medium and computing device |
CN112508128A (en) * | 2020-12-22 | 2021-03-16 | 北京百度网讯科技有限公司 | Training sample construction method, counting method, device, electronic equipment and medium |
CN112508128B (en) * | 2020-12-22 | 2023-07-25 | 北京百度网讯科技有限公司 | Training sample construction method, counting device, electronic equipment and medium |
CN112883898A (en) * | 2021-03-11 | 2021-06-01 | 中国科学院空天信息创新研究院 | Ground feature classification method and device based on SAR (synthetic aperture radar) image |
CN113705371B (en) * | 2021-08-10 | 2023-12-01 | 武汉理工大学 | Water visual scene segmentation method and device |
CN113705371A (en) * | 2021-08-10 | 2021-11-26 | 武汉理工大学 | Method and device for segmenting aquatic visual scene |
CN113673456A (en) * | 2021-08-26 | 2021-11-19 | 江苏省城市规划设计研究院有限公司 | Street view image scoring method based on color distribution learning |
CN113673456B (en) * | 2021-08-26 | 2024-03-26 | 江苏省城市规划设计研究院有限公司 | Streetscape image scoring method based on color distribution learning |
CN115063639A (en) * | 2022-08-11 | 2022-09-16 | 小米汽车科技有限公司 | Method for generating model, image semantic segmentation method, device, vehicle and medium |
CN116543175A (en) * | 2023-07-06 | 2023-08-04 | 南通凯锐激光科技有限公司 | Automatic adjustment method of laser level |
CN116543175B (en) * | 2023-07-06 | 2023-08-29 | 南通凯锐激光科技有限公司 | Automatic adjustment method of laser level |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190621 |