CN110322445A - Semantic segmentation method based on a loss function that maximizes the correlation between prediction and label - Google Patents
- Publication number: CN110322445A (application CN201910505928.8A)
- Authority: CN (China)
- Prior art keywords: label, picture, prediction, function, value
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045: Combinations of networks (computing arrangements based on biological models; neural networks; architecture)
- G06T7/10: Segmentation; Edge detection (image analysis)
- G06T2207/10004: Still image; Photographic image (image acquisition modality)
- G06T2207/20081: Training; Learning (special algorithmic details)
Abstract
The invention discloses a semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, comprising: (1) inputting a real scene picture into a segmentation model to obtain a predicted picture; (2) performing sliding convolution over the predicted picture and the label picture with a Gaussian kernel function to obtain local statistical features; (3) computing, from these local statistics, the strength of the linear correlation between corresponding regions of the predicted picture and the label picture; (4) using this correlation strength as a weight, rescaling the cross-entropy loss of each pixel and performing hard example mining; (5) updating the weight parameters of the segmentation model according to the resulting loss value; (6) repeating the above steps until training ends, then applying the model to semantic segmentation. With the present invention, the segmentation model pays more attention during training to the points that cause low correlation between prediction and label, thereby improving its image segmentation.
Description
Technical field
The invention belongs to the field of image semantic segmentation in computer vision, and more particularly relates to a semantic segmentation method based on a loss function that maximizes the correlation between prediction and label.
Background art
Semantic segmentation is a fundamental problem in computer vision, with wide applications in fields such as autonomous driving, medical image analysis, geographic information systems, and robotics. In practice, semantic segmentation of an image is usually treated as a multi-class classification problem over its points: the goal is to assign a predefined semantic label to every pixel in the image. In recent years, with the development of convolutional neural networks and the introduction of various segmentation models with strong learning ability, great progress has been made on the semantic segmentation problem. These models are normally trained and optimized by minimizing the average classification loss over pixels. The most common semantic segmentation loss is the softmax cross-entropy loss:

L_ce = -(1/N) · Σ_{n=1..N} Σ_{c=1..C} y_{n,c} · log(p_{n,c})

where N is the number of pixels in the picture, C is the number of object classes, y ∈ {0,1} is the class label representing the true class of a pixel, and p ∈ [0,1] is the probability predicted by the segmentation model, usually produced by a softmax operation. As the formula shows, the pixel-wise cross-entropy loss treats the points of the image as mutually independent samples and takes the average cross-entropy over all points as the total loss of the model prediction. However, the points in an image are strongly dependent, and these dependencies encode the structural information of objects. Because the pixel-wise loss function ignores the relationships between points, a segmentation model supervised by it is usually unsatisfactory when the visual features of the foreground are faint or when pixels belong to objects with small spatial structures.
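As a concrete illustration of the pixel-wise loss discussed above, the following sketch computes the mean softmax cross-entropy over pixels. NumPy is assumed and the function names are illustrative, not from the patent:

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    ez = np.exp(z)
    return ez / ez.sum(axis=axis, keepdims=True)

def pixelwise_ce(logits, labels):
    """Mean softmax cross entropy over all pixels.

    logits: (N, C) raw scores; labels: (N,) integer class ids.
    Each pixel is treated as an independent C-way classification,
    exactly the independence assumption criticized in the text.
    """
    p = softmax(logits)
    n = np.arange(labels.shape[0])
    return float(-np.log(p[n, labels] + 1e-12).mean())

# Two pixels, three classes: the first well classified, the second not.
logits = np.array([[5.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])
loss = pixelwise_ce(logits, labels)
```

The poorly classified second pixel dominates the average, which is the behavior this loss exhibits at scale: every point contributes independently of its neighbors.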
To exploit the structural information of objects contained in an image, the article "Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials", presented at the 26th Conference on Neural Information Processing Systems in 2012, proposed an efficient fully connected conditional random field (Conditional Random Field, CRF) to fit the relationships between points in an image and to drive the predictions of points with similar visual appearance in the real image to be more consistent. However, when CRF is used as a post-processing step, it usually requires a time-consuming iterative inference process and is sensitive to changes in visual appearance.
The article "Semantic Segmentation using Adversarial Networks", presented at the Workshop on Adversarial Training of the 30th Conference on Neural Information Processing Systems in 2016, proposed training the segmentation model with the idea of generative adversarial networks (GAN), using an additional discriminator network to judge whether the predicted picture and the label picture of the segmentation model have high-level structural consistency. However, GANs are generally difficult to train, and the training stage needs extra memory to store both the deep generator network and the discriminator network at the same time.
The article "Adaptive Affinity Fields for Semantic Segmentation" at the 2018 European Conference on Computer Vision proposed a neighborhood affinity loss (Affinity Field Loss) function. This loss applies an attractive force to the predictions of neighboring points that belong to the same object category, so that their predictions tend to be similar, and a repulsive force to the predictions of neighboring points that belong to different categories, so that their predictions tend to be dissimilar. It can thereby increase the prediction similarity of same-class neighbors and the dissimilarity of different-class neighbors, achieving a better segmentation effect. However, when computing the value of this loss function, the neighbor pair matrices must be stored, which generally requires several times the memory originally needed to compute the loss value.
Summary of the invention
In view of the deficiencies of the prior art, the invention proposes a semantic segmentation method based on a loss function that maximizes the correlation between prediction and label. It maximizes the correlation between the predicted picture output by the segmentation model and the label picture, so that the two reach a higher structural similarity and the segmentation effect of the model is improved.
A semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, comprising:
(1) inputting a real scene picture into the segmentation model to obtain a predicted picture;
(2) performing sliding convolution over the predicted picture and the label picture with a Gaussian kernel function to obtain local statistical features, including local mean and variance;
(3) computing, from the obtained local statistics, the strength of the linear correlation between corresponding regions of the predicted picture and the label picture;
(4) using this correlation strength as a weight, rescaling the cross-entropy loss of each pixel in the picture and performing hard example mining;
(5) computing the structural loss function over the hard examples of each training batch, further computing the total loss function used to optimize the segmentation model, and updating the weight parameters of the segmentation model;
(6) repeating steps (1) to (5), ending training after a preset number of training iterations, and applying the trained model to semantic segmentation.
For two pictures or picture blocks x and y, the common form of the structural similarity index SSIM is:

SSIM(x, y) = [(2·μ_x·μ_y + C1) / (μ_x² + μ_y² + C1)] · [(2·σ_x·σ_y + C2) / (σ_x² + σ_y² + C2)] · [(σ_xy + C3) / (σ_x·σ_y + C3)]

where the three factors measure illumination similarity, contrast similarity, and structural similarity, respectively; μ_x, σ_x, and σ_xy are the mean of x, the standard deviation of x, and the covariance of x and y; and C1, C2, and C3 are small positive constants that stabilize each component. Under the constraint C3 = C2/2, a further simplified form of SSIM can be obtained. From the formula above it can be seen that the key that lets SSIM measure the structural similarity of pictures is its third factor, which, up to the stabilizing constant, is in fact the Pearson correlation coefficient between the variables x and y:

ρ(x, y) = σ_xy / (σ_x · σ_y)
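The equivalence between the structure term of SSIM and the Pearson correlation coefficient can be checked numerically. The sketch below uses illustrative names and sets C3 to 0; the term reaches 1 for perfectly positively correlated blocks and -1 for perfectly anticorrelated ones:

```python
import numpy as np

def ssim_structure_term(x, y, c3=0.0):
    """Structure component s(x, y) = (sigma_xy + C3) / (sigma_x * sigma_y + C3).
    With c3 = 0 this is exactly the Pearson correlation coefficient."""
    mx, my = x.mean(), y.mean()
    sx = np.sqrt(((x - mx) ** 2).mean())
    sy = np.sqrt(((y - my) ** 2).mean())
    sxy = ((x - mx) * (y - my)).mean()
    return (sxy + c3) / (sx * sy + c3)

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0          # perfectly linearly correlated with x
r = ssim_structure_term(x, y)
```

Any affine transform of x with positive slope yields r = 1, which is why this factor isolates structure from intensity and contrast.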
However, SSIM is not suitable to be used directly as the loss function of a semantic segmentation model: in the semantic segmentation context SSIM is not a convex function, so it is not easy to optimize, and the model may fail to converge to a good local minimum. Based on the above analysis, the invention proposes a structural loss function, suited to semantic segmentation, that maximizes the correlation between the predicted picture and the label picture.
In step (2), a Gaussian kernel function w = {w_i | i = 1, 2, ..., k²} with standard deviation 1.5 is used, whose weights are normalized to sum to 1, to estimate the local statistical features:

μ_y = Σ_i w_i · y_i,    σ_y² = Σ_i w_i · (y_i − μ_y)²

where μ_y and σ_y² are respectively the local mean and local variance of the label picture, and y_i ∈ {0,1} represents the value of a pixel in the label picture. The local mean and variance of the predicted picture are computed by the same formulas. Sliding this Gaussian kernel pixel by pixel over the predicted picture and the label picture of the segmentation model yields the local statistical features of the pictures. Local statistics obtained in this way are isotropic, which facilitates the further operations of the subsequent steps.
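A minimal sketch of this local-statistics step, assuming a k × k Gaussian window with standard deviation 1.5 and weights normalized to sum to 1. Explicit loops are used instead of an optimized convolution for clarity, and the names are illustrative:

```python
import numpy as np

def gaussian_kernel(k=3, sigma=1.5):
    """k x k Gaussian weights normalized so they sum to 1."""
    ax = np.arange(k) - (k - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    w = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return w / w.sum()

def local_stats(img, k=3, sigma=1.5):
    """Weighted local mean and variance at every interior pixel."""
    w = gaussian_kernel(k, sigma)
    h, wd = img.shape
    r = k // 2
    mu = np.zeros((h - 2 * r, wd - 2 * r))
    var = np.zeros_like(mu)
    for i in range(r, h - r):
        for j in range(r, wd - r):
            patch = img[i - r:i + r + 1, j - r:j + r + 1]
            m = (w * patch).sum()
            mu[i - r, j - r] = m
            var[i - r, j - r] = (w * (patch - m) ** 2).sum()
    return mu, var

img = np.ones((5, 5))          # constant image: local mean 1, variance 0 everywhere
mu, var = local_stats(img)
```

In a real implementation the same quantities would be obtained with a single depthwise convolution, which is what makes the loss cheap to integrate into existing frameworks.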
In step (3), a label picture of shape H × W × C (where H, W, and C are respectively the height, width, and number of channels of the label picture) is regarded as C binary images. On this basis, the index measuring the strength of the linear correlation between the predicted picture and the label picture is:

e = | (y − μ_y) / (σ_y + C4) − (p − μ_p) / (σ_p + C4) |

where the error e characterizes the strength of correlation between two local regions: the smaller e, the stronger the correlation; μ_y and σ_y are the local mean and local standard deviation of the label picture, the pixel corresponding to label y lying at the center of the local region; μ_p and σ_p are the local mean and local standard deviation of the predicted picture; p is the probability predicted by the segmentation model; and C4 = 0.01 is a stabilizing factor. The total error e between two local regions measures their degree of linear correlation: the smaller the total error, the more likely the two local regions are positively correlated, which also means the structures of the two regions are very likely consistent; conversely, a larger error e indicates that the structures of the two regions are very likely inconsistent. The error e can therefore be regarded as a measure of the structural difference between two local regions.
Because the label picture y takes values in {0,1}, y² = y; substituting this into the variance formula gives σ_y² = μ_y · (1 − μ_y), from which we can further obtain:

y_nor = (y − μ_y) / (√(μ_y · (1 − μ_y)) + C4)

where y_nor is the locally normalized value. Taking the derivative of y_nor with respect to μ_y shows more clearly that y_nor reaches its maximum when y = 1 and all other points in the local region are 0, and its minimum when y = 0 and all other points in the local region are 1. The distribution of the predicted picture p is usually not as extreme as that of the label picture y, so the absolute values of the extremes of the normalized prediction p_nor are respectively smaller than the absolute values of the corresponding extremes of y_nor.
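The identity σ_y² = μ_y·(1 − μ_y) for binary labels can be verified directly. The sketch below uses uniform weights for brevity, since the identity holds for any weights summing to 1; C4 and the variable names follow the text, the rest is illustrative:

```python
import numpy as np

# Uniform 1/9 weights stand in for the Gaussian kernel; the identity
# sigma^2 = mu * (1 - mu) holds for any weights that sum to 1.
y = np.array([0., 1., 1., 0., 1., 0., 0., 0., 1.])   # binary label patch
w = np.full(9, 1.0 / 9.0)

mu = (w * y).sum()
var = (w * (y - mu) ** 2).sum()
assert np.isclose(var, mu * (1.0 - mu))              # y^2 = y  =>  var = mu(1-mu)

C4 = 0.01                                            # stabilizing constant from the text
y_nor = (y - mu) / (np.sqrt(var) + C4)               # locally normalized label
```

The identity means the local standard deviation of the label never has to be computed separately; it follows from the local mean alone.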
The statistical characteristics of an image are often spatially non-stationary and subject to abrupt changes. In addition, the global mean and variance are rotation invariant: rotating a picture does not change them, which is unsatisfactory for measuring the structural similarity of two pictures. Therefore, in order to better capture the local details of the image, the present invention uses local rather than global statistical features.
In step (4), the formulas used to rescale the cross-entropy loss of each pixel in the picture and to perform hard example mining are as follows:

f_{n,c} = 1{ e_{n,c} > β · e_max },
ℓ_str(n,c) = f_{n,c} · e_{n,c} · ℓ_bce(y_{n,c}, p_{n,c})

where n and c represent the coordinates of the current pixel in the picture, and e_max is the theoretical maximum of the error e; 1{·} equals 1 when the inner condition is true and 0 otherwise; β ∈ [0,1) is a weight factor used to select the samples to be discarded; y_{n,c} and p_{n,c} are respectively the label and prediction probability of the current pixel; ℓ_bce is the conventional sigmoid cross-entropy loss function; and ℓ_str is the structural loss function that maximizes the correlation between prediction and label. In practice the value of β is set to 0.1, an empirical value. Rescaling the conventional cross-entropy loss of each pixel by the error e makes the segmentation model, during training, focus more on the predictions that may make the predicted picture inconsistent with the label picture, thereby enhancing the consistency between the two. Here the cross-entropy loss function is still adopted, because the logarithmic loss has been experimentally shown in the literature to be very well suited to deep neural network classifiers.
While reweighting, the loss function proposed by the invention discards the sample points with lower error values. This is because, during training, the images in one batch may contain millions or even tens of millions of sample points. In the later stage of training, the segmentation model can usually reach a high pixel accuracy (for example 96%) but a relatively low mean intersection-over-union (mIoU) score (for example 78%). This phenomenon shows that easily classified samples dominate the loss and make the training of the segmentation model inefficient. Therefore, the samples with smaller structural difference e are regarded as easy samples and discarded during training; that is, these easy samples do not participate in the computation of the final structural loss value. The end result is that the hard samples (samples with larger structural difference e) that cause low linear correlation between the label picture y and the predicted picture p receive more attention. This is referred to in some literature as an online hard example mining (OHEM) strategy.
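The selection rule f = 1{e > β·e_max} can be sketched as follows, with a hypothetical error map and the β = 0.1 value given in the text:

```python
import numpy as np

# Hypothetical per-pixel structural errors e; e_max is assumed known.
e = np.array([0.00, 0.05, 0.20, 0.90, 1.50, 2.00])
e_max = 2.0                            # assumed theoretical maximum of e
beta = 0.1                             # empirical value from the text

f = (e > beta * e_max).astype(float)   # 1 for hard examples, 0 for easy ones
num_hard = int(f.sum())
```

Note that the comparison is strict, so a point sitting exactly on the threshold (0.20 here) is treated as easy and discarded.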
It is also worth noting that the loss function proposed by the invention extracts local statistical features as extra supervision information. It is therefore a region-wise loss function, fundamentally different from the usual pixel-wise loss functions, and a model trained with it is also supervised by local statistical information during training.
In step (5), the structural loss function over the hard examples of a single batch is:

L_str = (1 / M_hat) · Σ_{n,c} f_{n,c} · e_{n,c} · ℓ_bce(y_{n,c}, p_{n,c}),    M_hat = Σ_{n,c} f_{n,c}

where M_hat is the number of hard examples: f_{n,c} is 1 when the pixel at picture coordinate (n,c) is a hard example and 0 otherwise; N is the total number of pixels in the picture, and C represents the number of object classes. Accumulating and averaging the structural loss values of the individual pixels gives the total structural loss value of the current training batch. When computing the structural difference, the differences between the binary pictures of each channel of the label picture and the corresponding channels of the predicted picture are computed independently. This means that the binary pictures of different channels are considered mutually independent, and so naturally are the points in different binary pictures; therefore, when computing the value of the structural loss function, the sigmoid operation is chosen rather than the softmax operation, and hard examples are selected and counted over the range of all sample points of the entire set of binary pictures.
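A sketch of the batch structural loss under the reading above: hard examples selected by f, per-pixel sigmoid cross-entropy weighted by e, averaged over the hard examples only. All names and the toy inputs are illustrative:

```python
import numpy as np

def bce(y, p, eps=1e-12):
    """Per-pixel sigmoid (binary) cross entropy."""
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def structural_loss(y, p, e, beta=0.1, e_max=None):
    """Average of e-weighted BCE over hard examples only.

    y, p, e are flattened over all pixels and binary channels; the
    weighting f * e * bce follows the description in the text.
    """
    if e_max is None:
        e_max = e.max()
    f = (e > beta * e_max).astype(float)
    m_hat = max(f.sum(), 1.0)            # number of hard examples (guard against 0)
    return float((f * e * bce(y, p)).sum() / m_hat)

y = np.array([1.0, 0.0, 1.0, 0.0])
p = np.array([0.9, 0.1, 0.4, 0.6])       # last two poorly predicted
e = np.array([0.05, 0.05, 0.8, 1.0])     # larger error where structure disagrees
loss = structural_loss(y, p, e)
```

Only the last two points pass the threshold, so the two well-predicted points contribute nothing, which is the intended hard-example behavior.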
Finally, the total loss function used to optimize the segmentation model is:

L_total = λ · L_bce + (1 − λ) · L_str

where λ ∈ [0,1] is a weight factor that adjusts the relative importance of the conventional cross-entropy loss L_bce and the structural similarity loss L_str; in practice λ is set to 0.5. The conventional cross-entropy loss measures the similarity of image intensities between the predicted picture and the label picture, while the structural similarity loss measures the structural similarity between them. In the formula above, the pixel-wise cross-entropy loss plays a role similar to the illumination-similarity factor of SSIM, and the proposed structural similarity loss plays a role similar to the structure-similarity factor of SSIM. It is worth noting that the sigmoid cross-entropy loss is used here. This means that in the present invention the semantic segmentation problem is not, as in most common methods, treated as a multi-class classification of the pixels of one image, but as binary classification problems over multiple pixels, with multiple binary classifiers combined into one multi-class classifier.
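The combination of the two terms can be sketched as below. The exact form λ·L_bce + (1 − λ)·L_str is an assumption consistent with the description of λ as a relative-importance factor, not a formula confirmed by the source, and all names are illustrative:

```python
import numpy as np

def bce_map(y, p, eps=1e-12):
    # Per-pixel sigmoid (binary) cross entropy.
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def combined_loss(y, p, e, lam=0.5, beta=0.1):
    """lam * (mean pixel-wise BCE) + (1 - lam) * (structural term)."""
    f = (e > beta * e.max()).astype(float)   # hard-example mask
    m_hat = max(f.sum(), 1.0)
    l_bce = float(bce_map(y, p).mean())      # intensity-similarity term
    l_str = float((f * e * bce_map(y, p)).sum() / m_hat)  # structure term
    return lam * l_bce + (1.0 - lam) * l_str

y = np.array([1.0, 0.0])
p = np.array([0.8, 0.3])
e = np.array([0.2, 1.0])
loss = combined_loss(y, p, e)
```

Setting lam = 1.0 recovers the plain pixel-wise BCE, which is a convenient sanity check when integrating the loss into an existing training pipeline.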
Compared with the prior art, the invention has the following beneficial effects:
1. The invention proposes a structural loss function that provides a very intuitive way to measure the structural similarity between two images; it can be implemented fairly easily by convolution and needs only a small amount of extra computing resources during training. The proposed method can therefore be easily integrated into any existing segmentation framework.
2. With the proposed semantic segmentation method, the segmentation model is easy to train and needs no extra inference step or extra network structure; experiments prove that a segmentation model trained with the proposed method outperforms the baseline algorithm and several other methods of the same kind.
Description of the drawings
Fig. 1 is the overall framework and flow diagram of the invention;
Fig. 2 is a schematic diagram of the label picture and the predicted picture after normalization in an embodiment of the invention;
Fig. 3 is a schematic diagram of the statistical values of the normalized label picture and predicted picture in an embodiment of the invention;
Fig. 4 is a schematic diagram of the proportion of hard examples among all samples during training;
Fig. 5 shows qualitative segmentation results of an embodiment of the invention on the PASCAL VOC 2012 validation set.
Specific embodiment
The invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be pointed out that the embodiments described below are intended to facilitate the understanding of the invention and impose no restriction on it.
As shown in Fig. 1, in the semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, after the predicted picture output by the segmentation model is obtained, the predicted picture and the label picture are locally normalized; the strength of the correlation between them is then computed to obtain the structural difference value, the original cross-entropy loss is reweighted according to this value, and hard example mining is carried out at the same time. The model parameters are then updated according to the obtained loss value, and these steps are repeated until training stops, at which point a well-performing image semantic segmentation model is obtained.
Fig. 2 illustrates the change in pixel values of the original predicted picture and the label picture before and after normalization, and the resulting change in the cross-entropy loss value. Before normalization, the sigmoid cross-entropy loss between the original predicted picture and the label picture is about 2.805, and the cross-entropy loss of the misclassified central point accounts for about 57% of the total. After normalization, and after the original cross-entropy loss is reweighted by the structural difference value, the sigmoid cross-entropy loss between the normalized predicted picture and label picture is about 3.060, and the cross-entropy loss of the misclassified central point accounts for about 91% of the total. It can thus be seen that after normalization the loss of the points that are inconsistent between the two local regions is amplified; the segmentation model is punished more when it produces inconsistent predictions and is thereby guided toward a better local convergence point.
Fig. 3 records, over one training run, the maximum, minimum, mean, and median of the normalized label picture and the normalized predicted picture, together with the maximum of the structural difference; the Gaussian kernel size used is 3. It can be seen from Fig. 3 that, for the normalized predicted picture p_nor, the absolute values of its extremes are respectively smaller than those of the corresponding normalized label picture y_nor, and that the maximum value e_max of the structural difference is clearly larger than the extreme values of either. The mean e_mean and the median e_median of the structural difference are both close to 0, with e_mean larger than e_median at the same moment.
To further analyze the influence of the hard example mining strategy adopted in the proposed structural loss function, Fig. 4 records, for different values of the threshold parameter β used to select hard examples, the proportion of hard examples in the total number of samples and the change of this proportion over one training run. As shown in Fig. 4, the proportion of hard examples is very sensitive to the threshold parameter β: changes in β have a large effect on the proportion of hard examples, so its choice is rather critical. The β value used in the present invention is 0.1.
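The sensitivity of the hard-example ratio to β can be illustrated on a synthetic error distribution. The data here are random and purely illustrative, not the patent's measurements:

```python
import numpy as np

# Fraction of pixels kept as hard examples as the threshold beta varies,
# for a synthetic uniform error distribution on [0, 1].
rng = np.random.default_rng(0)
e = rng.uniform(0.0, 1.0, size=10_000)
ratios = {b: float((e > b * e.max()).mean()) for b in (0.05, 0.1, 0.3)}
```

For a uniform distribution the kept fraction is roughly 1 − β, so small changes in β directly shift how many points drive the gradient, which is why the choice of β matters.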
Fig. 5 illustrates the segmentation effect of segmentation models trained with the algorithm of the invention and with the conventional method. It is evident that the segmentation results of the model trained with the algorithm of the invention are considerably better in visual quality than those of the model trained with the conventional method, which qualitatively demonstrates the effectiveness of the algorithm.
The proposed method is applied below to concrete examples and compared with other methods of the same type, to demonstrate the technical effect and superiority of the invention.
The segmentation models used are the state-of-the-art DeepLabv3 and DeepLabv3+ semantic segmentation models; the performance of the segmentation model trained with the proposed method is compared against training with the conventional cross-entropy loss. Experiments are carried out on two large public data sets, PASCAL VOC 2012 and Cityscapes. The PASCAL VOC 2012 data set is divided into three parts (training set, validation set, and test set) with 1464, 1449, and 1456 pictures respectively; during training, an augmented PASCAL VOC 2012 data set containing 10582 pictures is used. The Cityscapes data set is a high-resolution data set in which the images have size 2048 × 1024; its training, validation, and test sets contain 2975, 500, and 1525 pictures respectively.
The evaluation metric used is the mean intersection-over-union (mIoU) score, i.e. the ratio of the intersection to the union of the segmented objects in the predicted picture and the label picture. The effectiveness of the algorithm is first verified on the PASCAL VOC 2012 validation set; the results are shown in Table 1, where CE and BCE denote the conventional softmax and sigmoid cross-entropy losses respectively, and the Gaussian kernel size of the Gaussian kernel function is the size of the local region used. As can be seen from the table, segmentation models trained with the proposed algorithm perform better than those trained with the traditional method; Table 1 also shows the relationship between the effect of the proposed algorithm and the Gaussian kernel size.
In addition, the proposed method is compared on the PASCAL VOC 2012 validation set with several methods of the same type; the comparison results are shown in Table 2. Table 2 shows the improvement of the GAN-based method over the baseline algorithm (Base), and of the CRF and Affinity methods over the baseline algorithms (CE, BCE). Compared with these methods, the proposed algorithm delivers the largest improvement over the baseline. Note that, because the experimental setup differs, the mIoU scores in Table 2 are not consistent with those in Table 1.
Table 1
Table 2
Further, the validity of the proposed algorithm is likewise verified on the Cityscapes validation set; the results are shown in Table 3.
Table 3
The embodiments described above explain the technical solution and beneficial effects of the invention in detail. It should be understood that the above are only specific embodiments of the invention and are not intended to limit it; any modification, supplement, or equivalent replacement made within the spirit of the invention shall be included in the protection scope of the invention.
Claims (8)
1. A semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, characterized by comprising:
(1) inputting a real scene picture into a segmentation model to obtain a predicted picture;
(2) performing sliding convolution over the predicted picture and the label picture with a Gaussian kernel function to obtain local statistical features, including local mean and variance;
(3) computing, from the obtained local statistics, the strength of the linear correlation between corresponding regions of the predicted picture and the label picture;
(4) using this correlation strength as a weight, rescaling the cross-entropy loss of each pixel in the picture and performing hard example mining;
(5) computing the structural loss function over the hard examples of each training batch, further computing the total loss function used to optimize the segmentation model, and updating the weight parameters of the segmentation model;
(6) repeating steps (1) to (5), ending training after a preset number of training iterations, and applying the trained model to semantic segmentation.
2. The semantic segmentation method according to claim 1, characterized in that in step (2) a Gaussian kernel function w = {w_i | i = 1, 2, ..., k²} with standard deviation 1.5 is used to obtain the local statistical features, the local statistics of the label picture being:

μ_y = Σ_i w_i · y_i,    σ_y² = Σ_i w_i · (y_i − μ_y)²

where μ_y and σ_y² are respectively the local mean and local variance of the label picture, and y_i ∈ {0,1} represents the value of a pixel in the label picture.
3. The semantic segmentation method according to claim 1, characterized in that in step (3) the index of the strength of the linear correlation between corresponding regions of the predicted picture and the label picture is computed as:

e = | (y − μ_y) / (σ_y + C4) − (p − μ_p) / (σ_p + C4) |

where the error e characterizes the strength of correlation between two local regions: the smaller e, the stronger the correlation; μ_y and σ_y are respectively the local mean and local standard deviation of the label picture, the pixel corresponding to label y lying at the center of the local region; μ_p and σ_p are respectively the local mean and local standard deviation of the predicted picture; p is the probability predicted by the segmentation model; and C4 = 0.01 is a stabilizing factor.
4. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 1, characterized in that, in step (4), the formula used to adjust the cross-entropy loss values of the pixels in the picture and perform hard example mining is as follows:
f_{n,c} = 1{ e_{n,c} > β·e_max },
where n and c are the coordinates of the current pixel in the picture, and e_max is the theoretical maximum of the error e; the indicator 1{·} equals 1 when the condition inside the braces is true and 0 otherwise; β ∈ [0, 1) is a weight factor for selecting the samples to be discarded; y_{n,c} and p_{n,c} are respectively the label and the prediction probability of the current pixel; the remaining terms are the conventional sigmoid cross-entropy loss function and the structural loss function capable of maximizing the correlation between prediction and label.
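The indicator f_{n,c} = 1{e_{n,c} > β·e_max} from claim 4 translates directly into a vectorized mask; the NumPy version below is an illustrative sketch with the claim-5 default β = 0.1.

```python
import numpy as np

def hard_example_mask(e, beta=0.1, e_max=1.0):
    """f_{n,c} = 1{ e_{n,c} > beta * e_max }.

    Keeps only pixels whose local correlation error exceeds the fraction
    beta of the theoretical maximum error e_max; beta = 0.1 follows claim 5."""
    return (np.asarray(e) > beta * e_max).astype(np.float32)
```

Pixels with errors below the threshold get weight 0, so their loss contribution is discarded during mining.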
5. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 4, characterized in that the value of β is set to 0.1.
6. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 1, characterized in that, in step (5), the structural loss function of the hard examples in each training batch is normalized by the number of hard examples, where f_{n,c} is 1 when the pixel at picture coordinate (n, c) is a hard example and 0 otherwise; N is the total number of pixels in the picture, and C is the number of object classes; accumulating and averaging the structural loss values of the individual pixels yields the total structural loss value of the current training batch.
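A minimal sketch of claim 6's batch structural loss, assuming it averages the per-pixel structural loss over the mined hard examples only (the formula image is not reproduced in the text, so the exact normalization is an assumption):

```python
import numpy as np

def batch_structural_loss(f, l_str):
    """Structural loss of the hard examples in one training batch.

    f     : 0/1 hard-example indicator per pixel (claim 4), shape (C, N)
    l_str : per-pixel structural loss values, same shape
    The sum of f is the number of hard examples; the guard avoids dividing
    by zero when a batch happens to contain no hard examples."""
    num_hard = f.sum()
    return float((f * l_str).sum() / max(num_hard, 1.0))
```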
7. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 6, characterized in that, in the formula of the total loss function, y and p represent the label picture and the predicted picture respectively, and λ ∈ [0, 1] is a weight factor that adjusts the relative importance of the conventional cross-entropy loss and the structural similarity loss; the conventional cross-entropy loss measures the similarity of pixel intensities between the predicted picture and the label picture, while the structural similarity loss measures the structural similarity between them.
8. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 7, characterized in that the value of λ is set to 0.5.
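Claims 7 and 8 describe λ as balancing the two losses, but the total-loss formula itself is an image that is not reproduced in the text. The convex combination below is therefore only a plausible reading, with λ = 0.5 as the claim-8 default; `total_loss` is an illustrative name.

```python
def total_loss(l_ce, l_ssl, lam=0.5):
    """Assumed total loss: a convex combination of the conventional
    cross-entropy loss l_ce and the structural similarity loss l_ssl,
    weighted by lam in [0, 1] (lam = 0.5 follows claim 8)."""
    return lam * l_ce + (1.0 - lam) * l_ssl
```

Setting lam = 1 recovers plain cross-entropy training; lam = 0 trains on the structural loss alone.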
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910505928.8A CN110322445B (en) | 2019-06-12 | 2019-06-12 | Semantic segmentation method based on maximum prediction and inter-label correlation loss function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110322445A true CN110322445A (en) | 2019-10-11 |
CN110322445B CN110322445B (en) | 2021-06-22 |
Family
ID=68119517
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910505928.8A Active CN110322445B (en) | 2019-06-12 | 2019-06-12 | Semantic segmentation method based on maximum prediction and inter-label correlation loss function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110322445B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110992365A (en) * | 2019-11-04 | 2020-04-10 | 杭州电子科技大学 | Loss function based on image semantic segmentation and design method thereof |
CN111739027A (en) * | 2020-07-24 | 2020-10-02 | 腾讯科技(深圳)有限公司 | Image processing method, device and equipment and readable storage medium |
CN111931782A (en) * | 2020-08-12 | 2020-11-13 | Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences | Semantic segmentation method, system, medium, and apparatus
CN112215803A (en) * | 2020-09-15 | 2021-01-12 | 昆明理工大学 | Aluminum plate eddy current inspection image defect segmentation method based on improved generation countermeasure network |
CN113688915A (en) * | 2021-08-24 | 2021-11-23 | 北京玖安天下科技有限公司 | Content security-oriented difficult sample mining method and device |
CN113920079A (en) * | 2021-09-30 | 2022-01-11 | 中国科学院深圳先进技术研究院 | Difficult sample mining method, system, terminal and storage medium |
CN115222940A (en) * | 2022-07-07 | 2022-10-21 | 北京邮电大学 | Semantic segmentation method and system |
CN115797642A (en) * | 2023-02-13 | 2023-03-14 | 华东交通大学 | Self-adaptive image semantic segmentation algorithm based on consistency regularization and semi-supervision field |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101493887A (en) * | 2009-03-06 | 2009-07-29 | 北京工业大学 | Eyebrow image segmentation method based on semi-supervision learning and Hash index |
CN101826204B (en) * | 2009-03-04 | 2012-09-26 | 中国人民解放军63976部队 | Quick particle image segmentation method based on improved waterline algorithm |
CA2963132A1 (en) * | 2014-10-01 | 2016-04-07 | Lyrical Labs Video Compression Technology, LLC | Method and system for unsupervised image segmentation using a trained quality metric |
CN105957063A (en) * | 2016-04-22 | 2016-09-21 | 北京理工大学 | CT image liver segmentation method and system based on multi-scale weighting similarity measure |
CN106548478A (en) * | 2016-10-28 | 2017-03-29 | 中国科学院苏州生物医学工程技术研究所 | Active contour image partition method based on local fit image |
CN107945269A (en) * | 2017-12-26 | 2018-04-20 | 清华大学 | Complicated dynamic human body object three-dimensional rebuilding method and system based on multi-view point video |
CN109359603A (en) * | 2018-10-22 | 2019-02-19 | 东南大学 | A kind of vehicle driver's method for detecting human face based on concatenated convolutional neural network |
CN109685807A (en) * | 2018-11-16 | 2019-04-26 | 广州市番禺区中心医院(广州市番禺区人民医院、广州市番禺区心血管疾病研究所) | Lower-limb deep veins thrombus automatic division method and system based on deep learning |
CN109685802A (en) * | 2018-12-13 | 2019-04-26 | 贵州火星探索科技有限公司 | A kind of Video segmentation live preview method of low latency |
Non-Patent Citations (2)
Title |
---|
ZHOU WANG ET AL: "Image Quality Assessment: From Error Visibility to Structural Similarity", 《 IEEE TRANSACTIONS ON IMAGE PROCESSING》 * |
YU Yuqin et al.: "U-Net Image Segmentation Based on the Global-Local Evaluation Method", 《Computer and Digital Engineering》 *
Also Published As
Publication number | Publication date |
---|---|
CN110322445B (en) | 2021-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110322445A (en) | A semantic segmentation method based on a loss function that maximizes the correlation between prediction and label | |
CN107330437B (en) | Feature extraction method based on a convolutional neural network real-time target detection model | |
CN102184221B (en) | Real-time video summarization method based on user preferences | |
Poggi et al. | Supervised segmentation of remote sensing images based on a tree-structured MRF model | |
CN108898145A (en) | Salient object detection method for images combined with deep learning | |
CN107481188A (en) | An image super-resolution reconstruction method | |
CN110427839A (en) | Video object detection method based on multi-layer feature fusion | |
CN106570486A (en) | Kernel correlation filter target tracking method based on feature fusion and Bayesian classification | |
CN109993775B (en) | Single-target tracking method based on feature compensation | |
CN109978918A (en) | Trajectory tracking method, apparatus and storage medium | |
CN106778687A (en) | Fixation point detection method based on local evaluation and global optimization | |
CN108416266A (en) | Fast video behavior recognition method using optical flow to extract moving targets | |
CN109671102A (en) | Composite target tracking method based on a deep-feature-fusion convolutional neural network | |
CN106778852A (en) | Image content recognition method that corrects misjudgments | |
CN110084782B (en) | Full-reference image quality assessment method based on image saliency detection | |
CN110827304B (en) | Traditional Chinese medicine tongue image positioning method and system based on a deep convolutional network and the level set method | |
CN107730515A (en) | Panoramic image saliency detection method based on region growing and an eye movement model | |
CN109871875 (en) | Building change detection method based on deep learning | |
CN110163213A (en) | Remote sensing image segmentation method based on disparity maps and a multi-scale deep network model | |
CN108305253A (en) | Whole-slide pathology diagnosis method based on multi-magnification deep learning | |
CN110222686A (en) | Object detection method, apparatus, computer equipment and storage medium | |
CN107833241A (en) | Real-time visual object detection method robust to ambient lighting changes | |
CN106991686A (en) | Level set contour tracking method based on a superpixel optical flow field | |
CN104715480B (en) | Object detection method based on a statistical background model | |
CN104484890A (en) | Video target tracking method based on a compound sparse model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||