CN110322445A - Semantic segmentation method based on a loss function that maximizes the correlation between prediction and label - Google Patents

Semantic segmentation method based on a loss function that maximizes the correlation between prediction and label

Info

Publication number
CN110322445A
Authority
CN
China
Prior art keywords
label
picture
prediction
function
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910505928.8A
Other languages
Chinese (zh)
Other versions
CN110322445B (en)
Inventor
赵帅
蔡登
武伯熹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201910505928.8A
Publication of CN110322445A
Application granted
Publication of CN110322445B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, comprising: (1) inputting a real-scene picture into a segmentation model to obtain a predicted picture; (2) sliding a Gaussian kernel over the predicted picture and the label picture by convolution to obtain local statistical features; (3) computing, from the obtained local statistical features, the strength of the linear correlation between corresponding regions of the predicted picture and the label picture; (4) using the index of linear-correlation strength as a weight to adjust the cross-entropy loss of each pixel in the picture, and performing hard example mining; (5) updating the weight parameters of the segmentation model according to the resulting loss value; (6) repeating the above steps until training ends, and then applying the model to semantic segmentation. With the invention, the segmentation model pays more attention during training to the points that cause low correlation between prediction and label, which improves its segmentation results.

Description

Semantic segmentation method based on a loss function that maximizes the correlation between prediction and label
Technical field
The invention belongs to the field of image semantic segmentation in computer vision, and more particularly relates to a semantic segmentation method based on a loss function that maximizes the correlation between prediction and label.
Background art
Semantic segmentation is a fundamental problem in computer vision, with wide application in autonomous driving, medical image analysis, geographic information systems, robotics and other fields. In practice, semantic segmentation of an image is usually treated as a multi-class classification problem over the points of the image: the goal is to assign each pixel one of a fixed set of semantic labels. In recent years, with the development of convolutional neural networks and the introduction of segmentation models with stronger learning capacity, semantic segmentation has made great progress. These models are generally trained and optimized by minimizing the average per-pixel classification loss. The most common loss function for semantic segmentation is the softmax cross-entropy loss:
$$\mathcal{L}_{ce}(y, p) = -\frac{1}{N}\sum_{n=1}^{N}\sum_{c=1}^{C} y_{n,c}\,\log p_{n,c}$$
where N is the number of pixels in the picture, C is the number of object classes, y ∈ {0, 1} is the class label representing the true class of a pixel, and p ∈ [0, 1] is the probability predicted by the segmentation model, usually produced by a softmax operation. As the formula shows, the per-pixel cross-entropy loss treats the points of an image as mutually independent samples and takes the average cross-entropy loss over all points as the total loss of the model prediction. However, the points of an image depend strongly on one another, and these dependencies carry the structural information of the objects. Because a per-pixel loss function ignores the relationships between points, a semantic segmentation model trained under its supervision usually segments poorly when the visual features of the foreground are weak or when pixels belong to objects with small spatial structures.
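The per-pixel softmax cross-entropy described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `softmax` and `pixelwise_softmax_ce` are hypothetical, labels are given as class indices rather than one-hot vectors, and a real implementation would use a tensor library.

```python
import math

def softmax(scores):
    """Numerically stable softmax over the C class scores of one pixel."""
    m = max(scores)
    exps = [math.exp(v - m) for v in scores]
    total = sum(exps)
    return [v / total for v in exps]

def pixelwise_softmax_ce(logits, labels):
    """Mean softmax cross-entropy over N pixels.

    logits: list of N lists, each holding C raw class scores.
    labels: list of N integer class indices (the one-hot y is implicit,
            so only the log-probability of the true class contributes).
    """
    loss = 0.0
    for scores, cls in zip(logits, labels):
        p = softmax(scores)
        loss += -math.log(p[cls])
    return loss / len(labels)

# Two pixels, three classes (illustrative values only).
example_logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]]
example_labels = [0, 2]
example_loss = pixelwise_softmax_ce(example_logits, example_labels)
```

Note that every pixel contributes independently to the sum; nothing in this loss looks at neighbouring pixels, which is the limitation the text points out.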
In order to use the structural information of the objects contained in an image, the article "Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials", presented at the 26th Conference on Neural Information Processing Systems in 2012, proposed an efficient fully connected conditional random field (Conditional Random Field, CRF) to model the relationships between points in an image and to make the predictions of points with similar visual appearance in the real picture more consistent. However, when a CRF is used as a post-processing step, it usually requires a time-consuming iterative inference process and is sensitive to changes in visual appearance.
The article "Semantic Segmentation using Adversarial Networks", presented at the Workshop on Adversarial Training of the 30th Conference on Neural Information Processing Systems in 2016, proposed training the segmentation model with the idea of generative adversarial networks (GAN): an additional discriminator network judges whether the predicted picture of the segmentation model and the label picture are consistent in high-level structure. However, GANs are generally hard to train, and in the training stage they need more memory to hold the deep generator network and the discriminator network at the same time.
The article "Adaptive Affinity Fields for Semantic Segmentation", presented at the European Conference on Computer Vision in 2018, proposed an affinity field loss (Affinity Field Loss) function. This loss applies an attracting force to the predictions of neighbouring points that belong to the same object class, so that their predictions become similar, and a repelling force to the predictions of neighbouring points that do not belong to the same class, so that their predictions become dissimilar. This increases the similarity of predictions for same-class neighbours and the dissimilarity for different-class neighbours, and achieves good segmentation results. However, when computing the loss value this method has to store neighbour-pair matrices, which generally takes several times the memory needed to compute the original loss value.
Summary of the invention
In view of the deficiencies of the prior art, the invention proposes a semantic segmentation method based on a loss function that maximizes the correlation between prediction and label. It maximizes the correlation between the predicted picture output by the segmentation model and the label picture, so that the two reach a higher structural similarity and the segmentation quality of the model improves.
A semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, comprising:
(1) inputting a real-scene picture into the segmentation model to obtain a predicted picture;
(2) sliding a Gaussian kernel over the predicted picture and the label picture by convolution to obtain local statistical features, including the local mean and variance;
(3) computing, from the obtained local statistical features, the strength of the linear correlation between corresponding regions of the predicted picture and the label picture;
(4) using the index of linear-correlation strength as a weight to adjust the cross-entropy loss of each pixel in the picture, and performing hard example mining;
(5) computing the structural loss function of the hard samples in each training batch, further computing the total loss function used to optimize the segmentation model, and updating the weight parameters of the segmentation model;
(6) repeating steps (1) to (5), ending training after a preset number of training iterations, and applying the trained model to semantic segmentation.
For two pictures or picture blocks x and y, the common form of the structural similarity index SSIM is as follows:
$$\mathrm{SSIM}(x, y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}\cdot\frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}\cdot\frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}$$
where the three factors measure luminance similarity, contrast similarity and structural similarity respectively; $\mu_x$, $\sigma_x^2$ and $\sigma_{xy}$ are the mean of x, the variance of x and the covariance of x and y; and $C_1$, $C_2$ and $C_3$ are very small positive constants that stabilize each component. Under the constraint $C_3 = C_2/2$, a simplified form of SSIM is obtained:
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$
As the first formula shows, what allows SSIM to measure the structural similarity of pictures is its third factor, which, apart from the stabilizing constant, is the Pearson correlation coefficient between the variables x and y:
$$\rho(x, y) = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$$
However, SSIM is not suitable for direct use as the loss function of a semantic segmentation model: in the context of semantic segmentation SSIM is not a convex function, so it is hard to optimize and the model may fail to converge to a local minimum. Based on this analysis, the invention proposes a structural loss function, suited to semantic segmentation, that maximizes the correlation between the predicted picture and the label picture.
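To make the role of the third SSIM factor concrete, the following minimal sketch computes it for two flattened picture blocks. With the stabilizer set to zero it is exactly the Pearson correlation coefficient. The function name `ssim_structure_term` and the default `c3=0.0` are illustrative assumptions, not part of the patent text.

```python
import math

def ssim_structure_term(x, y, c3=0.0):
    """Third SSIM factor, (cov_xy + c3) / (std_x * std_y + c3).

    x, y: equally long lists of pixel values from two picture blocks.
    With c3 == 0 this is the Pearson correlation coefficient; in that case
    the blocks must not be constant, or the denominator is zero.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return (cov + c3) / (std_x * std_y + c3)

# A block and a brighter, higher-contrast copy of it are perfectly correlated.
same_structure = ssim_structure_term([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# A block and its intensity-reversed copy are perfectly anti-correlated.
opposite_structure = ssim_structure_term([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The first pair differs in mean and contrast but not in structure, which is why only this factor is needed when the goal is structural agreement.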
In step (2), a Gaussian kernel $w = \{w_i \mid i = 1, 2, \ldots, k^2\}$ with a standard deviation of 1.5, whose weights are normalized to sum to 1 ($\sum_{i=1}^{k^2} w_i = 1$), is used to estimate the local statistical features:
$$\mu_y = \sum_{i=1}^{k^2} w_i\,y_i, \qquad \sigma_y^2 = \sum_{i=1}^{k^2} w_i\,(y_i - \mu_y)^2$$
where $\mu_y$ and $\sigma_y^2$ are the local mean and local variance of the label picture, and $y_i \in \{0, 1\}$ is the value of a pixel in the label picture. The local mean and variance of the predicted picture are computed with the same formulas.
Sliding this Gaussian kernel pixel by pixel over the predicted picture of the segmentation model and over the label picture by convolution yields the local statistical features of the pictures. Local statistics obtained in this way are isotropic, which benefits the subsequent steps.
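The sliding-window statistics of step (2) can be sketched as follows in plain Python. The function names, the zero padding at the picture border and the restriction to odd kernel sizes are assumptions made for illustration; the text above only fixes the standard deviation (1.5) and the normalization of the weights.

```python
import math

def gaussian_kernel(k, sigma=1.5):
    """k x k Gaussian weights (k odd), normalised so that they sum to 1."""
    centre = (k - 1) / 2.0
    raw = [[math.exp(-((i - centre) ** 2 + (j - centre) ** 2) / (2 * sigma ** 2))
            for j in range(k)] for i in range(k)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]

def local_stats(img, kernel):
    """Weighted local mean and variance at every pixel of a 2-D picture.

    Pixels outside the picture are treated as 0 (zero padding, an assumption).
    The variance uses sum(w * v^2) - mean^2, which equals sum(w * (v - mean)^2)
    because the weights sum to 1.
    """
    k = len(kernel)
    r = k // 2
    h, w = len(img), len(img[0])
    mean = [[0.0] * w for _ in range(h)]
    var = [[0.0] * w for _ in range(h)]
    for row in range(h):
        for col in range(w):
            m, m2 = 0.0, 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = row + i - r, col + j - r
                    v = img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0.0
                    m += kernel[i][j] * v
                    m2 += kernel[i][j] * v * v
            mean[row][col] = m
            var[row][col] = max(m2 - m * m, 0.0)  # clamp rounding noise
    return mean, var

# One channel of a label picture with a single foreground pixel in the centre.
label = [[0.0] * 5 for _ in range(5)]
label[2][2] = 1.0
label_mean, label_var = local_stats(label, gaussian_kernel(3))
```

In a training pipeline the same two quantities would normally come from a grouped 2-D convolution of the picture and of its square with the kernel, applied to the predicted picture and the label picture alike.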
In step (3), a label picture of shape H × W × C (H, W and C being the height, width and number of channels of the label picture) is regarded as C binary pictures. On this basis, the index that measures the strength of the linear correlation between the predicted picture and the label picture is:
$$e = \left|\frac{y - \mu_y + C_4}{\sigma_y + C_4} - \frac{p - \mu_p + C_4}{\sigma_p + C_4}\right|$$
where the error e characterizes the strength of the correlation between two local regions: the smaller e is, the stronger the correlation. $\mu_y$ and $\sigma_y$ are the local mean and local standard deviation of the label picture, and the pixel corresponding to label y lies at the centre of the local region; $\mu_p$ and $\sigma_p$ are the local mean and local standard deviation of the predicted picture; p is the probability predicted by the segmentation model; and $C_4 = 0.01$ is a stabilizing factor. The total error e between two local regions measures their degree of linear correlation. The smaller e is, the more likely the two local regions are positively correlated, which means their structures are very likely consistent; conversely, a large e indicates that the structures of the two regions are very likely inconsistent. The error e can therefore be regarded as a measure of the structural difference between two local regions.
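As a rough illustration of the error e for a single pixel, the sketch below normalizes the centre value of a label block and of a prediction block by their own weighted local mean and standard deviation and takes the absolute difference. The function names and the uniform weights used in the example are assumptions for brevity; the method itself uses Gaussian weights.

```python
import math

def weighted_mean_std(values, weights):
    """Weighted mean and standard deviation; weights are assumed to sum to 1."""
    mu = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mu) ** 2 for w, v in zip(weights, values))
    return mu, math.sqrt(var)

def structural_error(y_patch, p_patch, weights, c4=0.01):
    """Error e for the centre pixel of two flattened k*k local regions."""
    centre = len(y_patch) // 2
    mu_y, sd_y = weighted_mean_std(y_patch, weights)
    mu_p, sd_p = weighted_mean_std(p_patch, weights)
    y_nor = (y_patch[centre] - mu_y + c4) / (sd_y + c4)
    p_nor = (p_patch[centre] - mu_p + c4) / (sd_p + c4)
    return abs(y_nor - p_nor)

uniform = [1.0 / 9] * 9                    # illustrative weights, not Gaussian
y_block = [0, 0, 0, 0, 1, 0, 0, 0, 0]      # foreground pixel at the centre
# Same structure as the label, only much lower contrast: small error.
e_consistent = structural_error(y_block, [0.1] * 4 + [0.2] + [0.1] * 4, uniform)
# Opposite structure (centre predicted as background): large error.
e_inconsistent = structural_error(y_block, [0.9] * 4 + [0.1] + [0.9] * 4, uniform)
```

The low-contrast prediction is wrong in the per-pixel sense (0.2 for a foreground pixel) yet gets a small e, because after local normalization its structure matches the label.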
Because the label picture y takes values in {0, 1}, we have $y^2 = y$. Substituting this into the variance formula gives $\sigma_y^2 = \mu_y - \mu_y^2$, and therefore:
$$y_{nor} = \frac{y - \mu_y + C_4}{\sqrt{\mu_y - \mu_y^2} + C_4}$$
where $y_{nor}$ is the locally normalized value. Taking the derivative of $y_{nor}$ with respect to $\mu_y$ shows that $y_{nor}$ reaches its maximum $y_{nor}^{max}$ when y = 1 and all other points in the local region are 0, and its minimum $y_{nor}^{min}$ when y = 0 and all other points in the local region are 1. The distribution of the predicted picture p is usually less extreme than that of the label picture y, so the absolute values of the extremes $p_{nor}^{max}$ and $p_{nor}^{min}$ of the normalized value $p_{nor}$ are smaller than the absolute values of the corresponding $y_{nor}^{max}$ and $y_{nor}^{min}$.
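The variance identity used above follows in a few lines from the definition of the weighted local variance, using only the two facts already stated: the kernel weights sum to 1, and $y_i^2 = y_i$ for binary labels.

```latex
\begin{aligned}
\sigma_y^2 &= \sum_{i=1}^{k^2} w_i\,(y_i - \mu_y)^2 \\
           &= \sum_{i=1}^{k^2} w_i\,y_i^2 \;-\; 2\mu_y \sum_{i=1}^{k^2} w_i\,y_i \;+\; \mu_y^2 \sum_{i=1}^{k^2} w_i \\
           &= \mu_y - 2\mu_y^2 + \mu_y^2 \\
           &= \mu_y - \mu_y^2
\end{aligned}
```

The identity holds only for the label picture; the predicted probabilities are not binary, so the variance of the predicted picture has to be computed from its definition.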
The statistical features of an image are often spatially non-stationary and may change abruptly. In addition, the global mean and variance are rotation-invariant: rotating a picture does not change them, which is undesirable when measuring the structural similarity of two pictures. Therefore, to better capture local image detail, the invention uses local statistical features rather than global ones.
In step (4), the formulas used to adjust the cross-entropy loss of each pixel in the picture and to perform hard example mining are as follows:
$$f_{n,c} = \mathbb{1}\{e_{n,c} > \beta\,e_{max}\},$$
$$\mathcal{L}_{ssl}(y_{n,c}, p_{n,c}) = e_{n,c}\,f_{n,c}\,\mathcal{L}_{bce}(y_{n,c}, p_{n,c}),$$
where n and c give the position of the current pixel in the picture, and $e_{max}$ is the theoretical maximum of the error e. The indicator $\mathbb{1}\{\cdot\}$ equals 1 when the condition inside it is true and 0 otherwise. $\beta \in [0, 1)$ is a weight factor that selects the samples to be discarded; $y_{n,c}$ and $p_{n,c}$ are the label and the predicted probability of the current pixel; $\mathcal{L}_{bce}$ is the conventional sigmoid cross-entropy loss function; and $\mathcal{L}_{ssl}$ is the structural loss function that maximizes the correlation between prediction and label. In practice β is set to 0.1, an empirical value. The error e is used as a weight on the conventional cross-entropy loss of each pixel so that, during training, the segmentation model focuses more on the predictions that are likely to make the predicted picture inconsistent with the label picture, which strengthens the consistency between the two. The cross-entropy loss is retained here because the logarithmic loss has been shown experimentally in the literature to be well suited to deep neural network classifiers.
While re-weighting the loss, the proposed loss function also discards the sample points with low error values. The reason is that the images in one training batch may contain millions or even tens of millions of sample points. In the later stage of training, a segmentation model usually reaches a high pixel accuracy (for example, 96%) but a comparatively low mean intersection-over-union (mIoU) score (for example, 78%). This shows that easily classified samples dominate the loss and make the training of the segmentation model inefficient. Therefore, samples with a small structural difference e are regarded as easy samples and are discarded during training; that is, they do not take part in the computation of the final structural loss value. As a result, the hard samples (those with a large structural difference e), which cause low linear correlation between the label picture y and the predicted picture p, receive still more attention. In the literature this strategy is known as online hard example mining (OHEM).
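The re-weighting and the selection of hard samples can be sketched per sample as below. This is a minimal illustration: the names `bce` and `hard_example_terms` are hypothetical, the clipping constant `eps` is an implementation detail added here, and `e_max` is passed in as a parameter because the text does not give its value.

```python
import math

def bce(y, p, eps=1e-7):
    """Sigmoid (binary) cross-entropy for one label y in {0, 1} and probability p."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def hard_example_terms(labels, probs, errors, e_max, beta=0.1):
    """Per-sample pairs (f, e * f * bce) for flattened (n, c) positions.

    f is 1 for hard samples (e > beta * e_max) and 0 for easy ones, so easy
    samples contribute exactly zero to the structural loss.
    """
    terms = []
    for y, p, e in zip(labels, probs, errors):
        f = 1 if e > beta * e_max else 0
        terms.append((f, e * f * bce(y, p)))
    return terms

# Illustrative values: with e_max = 5 and beta = 0.1 the threshold is 0.5.
example_terms = hard_example_terms(
    labels=[1, 0], probs=[0.9, 0.8], errors=[0.05, 2.0], e_max=5.0)
```

Here the first sample is well predicted and structurally consistent, so it is dropped; the second is a confident false positive with a large e, so its cross-entropy is kept and doubled.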
It is also worth noting that the proposed loss function extracts local statistical features as additional supervision. It is therefore a region-wise loss function, which differs fundamentally from ordinary per-pixel loss functions, and a model trained with it is also supervised by local statistical information during training.
In step (5), the structural loss function of the hard samples in a single batch is:
$$\mathcal{L}_{ssl}(y, p) = \frac{1}{M}\sum_{n=1}^{N}\sum_{c=1}^{C}\mathcal{L}_{ssl}(y_{n,c}, p_{n,c})$$
where $M = \sum_{n=1}^{N}\sum_{c=1}^{C} f_{n,c}$ is the number of hard samples; $f_{n,c}$ is 1 when the pixel at picture position (n, c) is a hard sample and 0 otherwise; N is the total number of pixels in the picture; and C is the number of object classes. Summing and averaging the structural loss of each pixel gives the total structural loss of the current training batch. When the structural difference is computed, the difference between the binary picture of each channel of the label picture and the corresponding predicted picture is computed independently. The binary pictures of different channels are thus regarded as mutually independent, and so are the points in different binary pictures. For this reason the sigmoid operation is used instead of the softmax operation when computing the structural loss, and hard samples are selected and counted over all sample points of all binary pictures.
Finally, the total loss function used to optimize the segmentation model is:
$$\mathcal{L}_{all}(y, p) = \lambda\,\mathcal{L}_{bce}(y, p) + (1 - \lambda)\,\mathcal{L}_{ssl}(y, p)$$
where $\lambda \in [0, 1]$ is a weight factor that balances the relative importance of the conventional cross-entropy loss $\mathcal{L}_{bce}$ and the structural similarity loss $\mathcal{L}_{ssl}$; in practice λ is set to 0.5. The conventional cross-entropy loss measures the similarity of pixel intensities between the predicted picture and the label picture, while the structural similarity loss measures their structural similarity. In the formula above, the per-pixel cross-entropy loss plays a role similar to the luminance term of SSIM, and the proposed structural similarity loss plays a role similar to the structure term of SSIM. Note that the sigmoid cross-entropy loss is used here. This means that, unlike most common methods, the invention does not treat semantic segmentation as one multi-class classification problem over the pixels of an image; it treats it as multiple binary classification problems over the pixels, and then combines the multiple binary classifiers into one multi-class classifier.
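Putting the batch structural loss and the total loss together gives the short sketch below. It works on flattened lists of labels, sigmoid probabilities and precomputed errors e; the function names, the mean over all samples for the cross-entropy term, the fallback of 0 when no hard sample exists, and the `e_max` parameter are assumptions made for illustration.

```python
import math

def bce(y, p, eps=1e-7):
    """Sigmoid (binary) cross-entropy for one label y in {0, 1} and probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def total_loss(labels, probs, errors, e_max, beta=0.1, lam=0.5):
    """Return (l_all, l_bce, l_ssl) for one batch of flattened (n, c) samples.

    l_bce: mean cross-entropy over all samples.
    l_ssl: sum of e * f * bce divided by the number M of hard samples
           (0 if there are none, to avoid dividing by zero).
    l_all: lam * l_bce + (1 - lam) * l_ssl.
    """
    losses = [bce(y, p) for y, p in zip(labels, probs)]
    flags = [1 if e > beta * e_max else 0 for e in errors]
    m = sum(flags)
    l_bce = sum(losses) / len(losses)
    l_ssl = sum(e * f * l for e, f, l in zip(errors, flags, losses)) / m if m else 0.0
    return lam * l_bce + (1.0 - lam) * l_ssl, l_bce, l_ssl

# Illustrative batch: one easy sample and one hard sample.
l_all, l_bce, l_ssl = total_loss(
    labels=[1, 0], probs=[0.9, 0.8], errors=[0.05, 2.0], e_max=5.0)
```

Dividing by M rather than by N·C keeps the structural term from shrinking as more samples become easy late in training.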
Compared with the prior art, the invention has the following advantages:
1. The invention proposes a structural loss function that provides a very intuitive way to measure the structural similarity between two images. It can be implemented relatively easily with convolutions and needs only a small amount of additional computing resources during training. The proposed method can therefore be easily integrated into any existing segmentation framework.
2. With the proposed semantic segmentation method, the segmentation model is easy to train and needs no additional inference step or additional network structure. Extensive experiments show that a segmentation model trained with the proposed method performs better than the baseline algorithms and some other methods of the same type.
Brief description of the drawings
Fig. 1 is the overall framework and flow diagram of the invention;
Fig. 2 is a schematic diagram of the label picture and the predicted picture after normalization in an embodiment of the invention;
Fig. 3 is a schematic diagram of the statistics of the normalized label picture and predicted picture in an embodiment of the invention;
Fig. 4 is a schematic diagram of the proportion of hard samples among all samples during training;
Fig. 5 shows qualitative segmentation results of an embodiment of the invention on the PASCAL VOC 2012 validation set.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and embodiments. It should be noted that the embodiments described below are intended to aid understanding of the invention and do not limit it in any way.
As shown in Fig. 1, in the semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, after the predicted picture output by the segmentation model is obtained, the predicted picture and the label picture are locally normalized. The strength of the correlation between the predicted picture and the label picture is then computed to obtain the structural difference, the original cross-entropy loss is re-weighted according to the structural difference, and hard example mining is performed at the same time. The model parameters are then updated according to the resulting loss value, and these steps are repeated until training stops. The result is an image semantic segmentation model with good performance.
Fig. 2 shows how the pixel values of the original predicted picture and label picture change before and after normalization, and the resulting change in the cross-entropy loss. Before normalization, the sigmoid cross-entropy loss between the original predicted picture and the label picture is about 2.805. The central point is misclassified, and its cross-entropy loss accounts for about 57% of the total cross-entropy loss. After normalization, and after the original cross-entropy loss is re-weighted by the structural difference, the sigmoid cross-entropy loss between the normalized predicted picture and label picture is about 3.060, of which the misclassified central point accounts for about 91%. Thus, after normalization the loss of the point that is inconsistent between the two local regions is amplified: the segmentation model is penalized more when it produces an inconsistent prediction, which guides it toward a better local convergence point.
Fig. 3 records, over one training run, the maximum and minimum of the normalized label picture and of the normalized predicted picture, together with the maximum, mean and median of the structural difference. The Gaussian kernel size used is 3. As Fig. 3 shows, for the normalized predicted picture $p_{nor}$, the magnitudes of $p_{nor}^{max}$ and $p_{nor}^{min}$ are smaller than those of the corresponding $y_{nor}^{max}$ and $y_{nor}^{min}$ of the normalized label picture $y_{nor}$, and the maximum structural difference $e_{max}$ is clearly larger than the magnitude of $y_{nor}^{max}$ or $y_{nor}^{min}$. The mean $e_{mean}$ and the median $e_{median}$ of the structural difference are both close to 0, and at any given moment $e_{mean}$ is larger than $e_{median}$.
To further analyse the effect of the hard example mining strategy used in the proposed structural loss function, Fig. 4 records, for different values of the threshold parameter β used to select hard samples, the proportion of hard samples among all samples and how this proportion changes over one training run.
As shown in Fig. 4, the proportion of hard samples is very sensitive to the threshold parameter β: a change in β has a large effect on the proportion of hard samples, so its choice is critical. The value of β used in the invention is 0.1.
Fig. 5 shows the segmentation results of a segmentation model trained with the proposed algorithm and of one trained with the conventional method. The segmentation results of the model trained with the proposed algorithm are visibly better than those of the model trained with the conventional method, which qualitatively demonstrates the effectiveness of the proposed algorithm.
Below, the proposed method is applied in a concrete example and compared with other methods of the same type to show the technical effect and advantages of the invention.
The segmentation models used in the invention are the state-of-the-art DeepLabv3 and DeepLabv3+ semantic segmentation models. The invention compares the performance of the segmentation model trained with the proposed method against that of the model trained with the conventional cross-entropy loss.
The invention is tested on two large public data sets, PASCAL VOC 2012 and Cityscapes. The PASCAL VOC 2012 data set is divided into three parts, a training set, a validation set and a test set, with 1464, 1449 and 1456 pictures respectively. For training, the invention uses an augmented version of PASCAL VOC 2012 containing 10582 pictures. Cityscapes is a high-resolution data set whose images are 2048 × 1024 in size; its training, validation and test sets contain 2975, 500 and 1525 pictures respectively.
The evaluation metric used in the invention is the mean intersection-over-union (mIoU) score, that is, the ratio of the intersection to the union of the segmented objects in the predicted picture and the label picture, averaged over classes. The invention first verifies the effect of the algorithm on the PASCAL VOC 2012 validation set; the results are shown in Table 1. In Table 1, CE and BCE are the conventional softmax and sigmoid cross-entropy losses respectively, and the Gaussian kernel size is the size of the local region used. As the table shows, segmentation models trained with the proposed algorithm perform better than those trained with the traditional method. Table 1 also shows how the effect of the proposed algorithm depends on the Gaussian kernel size.
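For reference, the mIoU score can be computed as in the sketch below, which takes flattened lists of predicted and true class indices. The function name `mean_iou` and the choice to skip classes that appear in neither picture are assumptions; benchmark toolkits usually accumulate a confusion matrix over the whole data set instead.

```python
def mean_iou(pred, label, num_classes):
    """Mean intersection-over-union over classes for flattened class-index lists.

    Classes absent from both prediction and label are skipped (an assumption),
    so they neither reward nor penalise the score.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for a, b in zip(pred, label) if a == c and b == c)
        union = sum(1 for a, b in zip(pred, label) if a == c or b == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Four pixels, two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
example_miou = mean_iou(pred=[0, 0, 1, 1], label=[0, 1, 1, 1], num_classes=2)
```

In this example three of four pixels are correct (75% pixel accuracy) while the mIoU is only about 58%, which illustrates why the two metrics can diverge as described earlier in the text.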
In addition, the invention compares the proposed method with several methods of the same type on the PASCAL VOC 2012 validation set. The comparison is shown in Table 2.
Table 2 shows the improvement of the GAN-based method over its baseline algorithm (Base), and the improvements of the CRF method and the Affinity method over their baseline algorithms (CE, BCE). Compared with these methods, the proposed algorithm gives the largest improvement over its baseline. Because the experimental settings differ, the mIoU scores in Table 2 are not the same as those in Table 1.
Table 1
Table 2
Further, the invention also verifies the effectiveness of the proposed algorithm on the Cityscapes validation set; the results are shown in Table 3.
Table 3
The embodiments described above explain the technical solution and the beneficial effects of the invention in detail. It should be understood that the above are only specific embodiments of the invention and are not intended to limit it; any modification, supplement or equivalent replacement made within the spirit of the invention shall fall within its scope of protection.

Claims (8)

1. A semantic segmentation method based on a loss function that maximizes the correlation between prediction and label, characterized by comprising:
(1) inputting a real-scene picture into a segmentation model to obtain a predicted picture;
(2) sliding a Gaussian kernel over the predicted picture and the label picture by convolution to obtain local statistical features, including the local mean and variance;
(3) computing, from the obtained local statistical features, the strength of the linear correlation between corresponding regions of the predicted picture and the label picture;
(4) using the index of linear-correlation strength as a weight to adjust the cross-entropy loss of each pixel in the picture, and performing hard example mining;
(5) computing the structural loss function of the hard samples in each training batch, further computing the total loss function used to optimize the segmentation model, and updating the weight parameters of the segmentation model;
(6) repeating steps (1) to (5), ending training after a preset number of training iterations, and applying the trained model to semantic segmentation.
2. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 1, characterized in that, in step (2), a Gaussian kernel function $w = \{w_i \mid i = 1, 2, \ldots, k^2\}$ with a standard deviation of 1.5 is used to obtain the local statistical features, wherein the local statistical features of the label picture are as follows:
$$\mu_y = \sum_{i=1}^{k^2} w_i y_i, \qquad \sigma_y^2 = \sum_{i=1}^{k^2} w_i \left(y_i - \mu_y\right)^2$$
wherein $\mu_y$ and $\sigma_y^2$ are respectively the local mean and local variance of the label picture, and $y_i \in \{0, 1\}$ is the value of a pixel in the label picture.
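The local statistics of claim 2 can be sketched in plain Python as follows. This is a minimal illustration rather than the patented implementation: the kernel size `k = 3`, the zero padding at the picture border, and the function names `gaussian_kernel` and `local_stats` are assumptions of this sketch, since the claim fixes only the standard deviation of 1.5.

```python
import math

def gaussian_kernel(k=3, sigma=1.5):
    """Normalized k x k Gaussian weights w_i (they sum to 1)."""
    c = (k - 1) / 2.0
    w = [[math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2.0 * sigma ** 2))
          for j in range(k)] for i in range(k)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def local_stats(img, w):
    """Gaussian-weighted local mean and variance at every pixel.

    img is a list of rows. Pixels outside the picture count as 0
    (zero padding is an assumption of this sketch).
    """
    k, r = len(w), len(w) // 2
    h, wd = len(img), len(img[0])

    def px(i, j):
        return img[i][j] if 0 <= i < h and 0 <= j < wd else 0.0

    mean = [[0.0] * wd for _ in range(h)]
    var = [[0.0] * wd for _ in range(h)]
    for i in range(h):
        for j in range(wd):
            # (weight, pixel value) pairs of the k x k window centred on (i, j)
            patch = [(w[a][b], px(i + a - r, j + b - r))
                     for a in range(k) for b in range(k)]
            mu = sum(wi * yi for wi, yi in patch)
            mean[i][j] = mu
            var[i][j] = sum(wi * (yi - mu) ** 2 for wi, yi in patch)
    return mean, var
```

The same function applies to the predicted picture by passing per-pixel probabilities instead of 0/1 labels. In practice the sliding window would normally be run as a convolution in a tensor library; the nested loops here only make the weighted sums explicit.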
3. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 1, characterized in that, in step (3), the index of the strength of the linear correlation between corresponding regions of the predicted picture and the label picture is calculated as:
$$e = \left| \frac{y - \mu_y + C_4}{\sigma_y + C_4} - \frac{p - \mu_p + C_4}{\sigma_p + C_4} \right|$$
wherein the error $e$ characterizes the strength of the correlation between two local regions, a smaller $e$ indicating a stronger correlation; $\mu_y$ and $\sigma_y$ are respectively the local mean and local standard deviation of the label picture, the pixel corresponding to label $y$ being located at the center of the local region; $\mu_p$ and $\sigma_p$ are respectively the local mean and local standard deviation of the predicted picture; $p$ is the probability predicted by the segmentation model; and $C_4 = 0.01$ is a stability factor.
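A per-pixel sketch of the error $e$ is given below. It assumes that $e$ is the absolute difference between the locally normalized label and the locally normalized prediction, with the stability factor $C_4$ added to both numerator and denominator; that exact form, and the function name `structural_error`, are assumptions of this sketch.

```python
def structural_error(y, p, mu_y, sigma_y, mu_p, sigma_p, c4=0.01):
    """Error e for one pixel: a smaller value means a stronger correlation.

    y, p             label value and predicted probability at the window centre
    mu_*, sigma_*    local mean and local standard deviation of each picture
    c4               stability factor (0.01 in claim 3)
    """
    norm_y = (y - mu_y + c4) / (sigma_y + c4)   # locally normalized label
    norm_p = (p - mu_p + c4) / (sigma_p + c4)   # locally normalized prediction
    return abs(norm_y - norm_p)
```

When prediction and label share the same value and the same local statistics, the two normalized terms coincide and the error is zero, which is the behaviour the claim describes for strongly correlated regions.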
4. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 1, characterized in that, in step (4), the formulas used to adjust the cross-entropy loss value of each pixel in the picture and to perform hard example mining are as follows:
$$f_{n,c} = \mathbf{1}\{e_{n,c} > \beta e_{\max}\},$$
$$\mathcal{L}_{ssl}(y_{n,c}, p_{n,c}) = e_{n,c}\, f_{n,c}\, \mathcal{L}_{bce}(y_{n,c}, p_{n,c}),$$
wherein $n$ and $c$ denote the coordinates of the current pixel in the picture; $e_{\max}$ is the theoretical maximum of the error $e$; $\mathbf{1}\{\cdot\}$ equals 1 when the condition inside it is true and 0 otherwise; $\beta \in [0, 1)$ is a weight factor used to select the examples to be discarded; $y_{n,c}$ and $p_{n,c}$ are respectively the label and the predicted probability of the current pixel; $\mathcal{L}_{bce}$ is the conventional sigmoid cross-entropy loss function; and $\mathcal{L}_{ssl}$ is the structural loss function that maximizes the correlation between prediction and label.
5. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 4, characterized in that the value of $\beta$ is set to 0.1.
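The reweighting and hard example mining of claims 4 and 5 can be illustrated for a single pixel as below. This is a hedged sketch: it assumes the per-pixel structural loss is the sigmoid cross entropy multiplied by the error `e` and by the indicator `f`, and it takes `e_max` as a caller-supplied argument because its value is not stated in the claims. The helper names and the clipping constant `eps` are illustrative.

```python
import math

def bce(y, p, eps=1e-7):
    """Sigmoid (binary) cross entropy for one pixel; p is a probability."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def hard_flag(e, e_max, beta=0.1):
    """Indicator f: 1 for a hard example (e > beta * e_max), otherwise 0."""
    return 1 if e > beta * e_max else 0

def pixel_structural_loss(y, p, e, e_max, beta=0.1):
    """Cross entropy weighted by e and zeroed for discarded (easy) pixels."""
    return e * hard_flag(e, e_max, beta) * bce(y, p)
```

With `beta = 0.1`, a pixel whose error is at most one tenth of `e_max` contributes nothing, so gradient updates concentrate on regions where prediction and label are weakly correlated.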
6. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 1, characterized in that, in step (5), the formula of the structural loss function of the hard examples in each training batch is:
$$\mathcal{L}_{ssl} = \frac{1}{M} \sum_{n=1}^{N} \sum_{c=1}^{C} \mathcal{L}_{ssl}(y_{n,c}, p_{n,c})$$
wherein $M = \sum_{n=1}^{N} \sum_{c=1}^{C} f_{n,c}$ is the number of hard examples; $f_{n,c}$ is 1 when the pixel at picture coordinate $(n, c)$ is a hard example and 0 otherwise; $N$ is the total number of pixels in the picture; and $C$ is the number of object classes. Summing and averaging the structural loss values of the individual pixels yields the total structural loss value of the current training batch.
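The batch-level averaging of claim 6 can be sketched as follows, assuming the per-pixel structural losses and the hard example flags are already available as two nested lists of shape N x C. The function name and the choice to return 0.0 when no pixel is flagged are assumptions of this sketch; the claim does not address that case.

```python
def batch_structural_loss(pixel_losses, flags):
    """Average the per-pixel structural losses over the M hard examples.

    pixel_losses[n][c]  structural loss of pixel n for class c
    flags[n][c]         1 if that pixel is a hard example, else 0
    """
    m = sum(sum(row) for row in flags)  # M: number of hard examples
    if m == 0:
        return 0.0  # assumption: nothing to average when no pixel is hard
    # The flag is applied again here so unflagged entries never contribute.
    total = sum(loss * f
                for loss_row, flag_row in zip(pixel_losses, flags)
                for loss, f in zip(loss_row, flag_row))
    return total / m
```

Dividing by M rather than by N x C keeps the loss scale stable as the share of hard examples shrinks during training.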
7. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 6, characterized in that the formula of the total loss function is:
$$\mathcal{L}_{total}(y, p) = \lambda\, \mathcal{L}_{bce}(y, p) + (1 - \lambda)\, \mathcal{L}_{ssl}(y, p)$$
wherein $y$ and $p$ respectively denote the label picture and the predicted picture; $\lambda \in [0, 1]$ is a weight factor that adjusts the relative importance of the conventional cross-entropy loss $\mathcal{L}_{bce}$ and the structural similarity loss $\mathcal{L}_{ssl}$; the conventional cross-entropy loss measures the similarity of pixel intensities between the predicted picture and the label picture, and the structural similarity loss measures the structural similarity between the predicted picture and the label picture.
8. The semantic segmentation method based on a loss function that maximizes the correlation between prediction and label according to claim 7, characterized in that the value of $\lambda$ is set to 0.5.
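The combination of the two losses in claims 7 and 8 amounts to a convex mix controlled by $\lambda$. The sketch below assumes $\lambda$ weights the cross-entropy term and $1 - \lambda$ weights the structural term; with the claimed value of 0.5 the two assignments coincide. The function name `total_loss` is illustrative.

```python
def total_loss(l_bce, l_ssl, lam=0.5):
    """Mix the cross-entropy loss and the structural loss.

    lam = 1 uses cross entropy only; lam = 0 uses the structural loss only.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * l_bce + (1.0 - lam) * l_ssl
```

For example, `total_loss(0.8, 0.4)` averages the two terms equally, while `total_loss(0.8, 0.4, lam=1.0)` returns the cross-entropy value unchanged.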
CN201910505928.8A 2019-06-12 2019-06-12 Semantic segmentation method based on maximum prediction and inter-label correlation loss function Active CN110322445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910505928.8A CN110322445B (en) 2019-06-12 2019-06-12 Semantic segmentation method based on maximum prediction and inter-label correlation loss function

Publications (2)

Publication Number Publication Date
CN110322445A (en) 2019-10-11
CN110322445B (en) 2021-06-22

Family

ID=68119517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910505928.8A Active CN110322445B (en) 2019-06-12 2019-06-12 Semantic segmentation method based on maximum prediction and inter-label correlation loss function

Country Status (1)

Country Link
CN (1) CN110322445B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110992365A (en) * 2019-11-04 2020-04-10 杭州电子科技大学 Loss function based on image semantic segmentation and design method thereof
CN111739027A (en) * 2020-07-24 2020-10-02 腾讯科技(深圳)有限公司 Image processing method, device and equipment and readable storage medium
CN111931782A (en) * 2020-08-12 2020-11-13 中国科学院上海微***与信息技术研究所 Semantic segmentation method, system, medium, and apparatus
CN112215803A (en) * 2020-09-15 2021-01-12 昆明理工大学 Aluminum plate eddy current inspection image defect segmentation method based on improved generation countermeasure network
CN113688915A (en) * 2021-08-24 2021-11-23 北京玖安天下科技有限公司 Content security-oriented difficult sample mining method and device
CN113920079A (en) * 2021-09-30 2022-01-11 中国科学院深圳先进技术研究院 Difficult sample mining method, system, terminal and storage medium
CN115222940A (en) * 2022-07-07 2022-10-21 北京邮电大学 Semantic segmentation method and system
CN115797642A (en) * 2023-02-13 2023-03-14 华东交通大学 Self-adaptive image semantic segmentation algorithm based on consistency regularization and semi-supervision field

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101493887A (en) * 2009-03-06 2009-07-29 北京工业大学 Eyebrow image segmentation method based on semi-supervision learning and Hash index
CN101826204B (en) * 2009-03-04 2012-09-26 中国人民解放军63976部队 Quick particle image segmentation method based on improved waterline algorithm
CA2963132A1 (en) * 2014-10-01 2016-04-07 Lyrical Labs Video Compression Technology, LLC Method and system for unsupervised image segmentation using a trained quality metric
CN105957063A (en) * 2016-04-22 2016-09-21 北京理工大学 CT image liver segmentation method and system based on multi-scale weighting similarity measure
CN106548478A (en) * 2016-10-28 2017-03-29 中国科学院苏州生物医学工程技术研究所 Active contour image partition method based on local fit image
CN107945269A (en) * 2017-12-26 2018-04-20 清华大学 Complicated dynamic human body object three-dimensional rebuilding method and system based on multi-view point video
CN109359603A (en) * 2018-10-22 2019-02-19 东南大学 A kind of vehicle driver's method for detecting human face based on concatenated convolutional neural network
CN109685807A (en) * 2018-11-16 2019-04-26 广州市番禺区中心医院(广州市番禺区人民医院、广州市番禺区心血管疾病研究所) Lower-limb deep veins thrombus automatic division method and system based on deep learning
CN109685802A (en) * 2018-12-13 2019-04-26 贵州火星探索科技有限公司 A kind of Video segmentation live preview method of low latency

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHOU WANG ET AL: "Image Quality Assessment: From Error Visibility to Structural Similarity", 《 IEEE TRANSACTIONS ON IMAGE PROCESSING》 *
余玉琴 等: "基于 Gloabl-Local评估方法的U-Net图像分割", 《计算机与数字工程》 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110992365B (en) * 2019-11-04 2023-04-18 杭州电子科技大学 Loss function based on image semantic segmentation and design method thereof
CN110992365A (en) * 2019-11-04 2020-04-10 杭州电子科技大学 Loss function based on image semantic segmentation and design method thereof
CN111739027A (en) * 2020-07-24 2020-10-02 腾讯科技(深圳)有限公司 Image processing method, device and equipment and readable storage medium
CN111739027B (en) * 2020-07-24 2024-04-26 腾讯科技(深圳)有限公司 Image processing method, device, equipment and readable storage medium
CN111931782A (en) * 2020-08-12 2020-11-13 中国科学院上海微***与信息技术研究所 Semantic segmentation method, system, medium, and apparatus
CN111931782B (en) * 2020-08-12 2024-03-01 中国科学院上海微***与信息技术研究所 Semantic segmentation method, system, medium and device
CN112215803A (en) * 2020-09-15 2021-01-12 昆明理工大学 Aluminum plate eddy current inspection image defect segmentation method based on improved generation countermeasure network
CN112215803B (en) * 2020-09-15 2022-07-12 昆明理工大学 Aluminum plate eddy current inspection image defect segmentation method based on improved generation countermeasure network
CN113688915B (en) * 2021-08-24 2023-07-25 北京玖安天下科技有限公司 Difficult sample mining method and device for content security
CN113688915A (en) * 2021-08-24 2021-11-23 北京玖安天下科技有限公司 Content security-oriented difficult sample mining method and device
CN113920079A (en) * 2021-09-30 2022-01-11 中国科学院深圳先进技术研究院 Difficult sample mining method, system, terminal and storage medium
CN115222940A (en) * 2022-07-07 2022-10-21 北京邮电大学 Semantic segmentation method and system
CN115797642A (en) * 2023-02-13 2023-03-14 华东交通大学 Self-adaptive image semantic segmentation algorithm based on consistency regularization and semi-supervision field

Also Published As

Publication number Publication date
CN110322445B (en) 2021-06-22

Similar Documents

Publication Publication Date Title
CN110322445A (en) A kind of semantic segmentation method based on maximization prediction and impairment correlations function between label
CN107330437B (en) Feature extraction method based on convolutional neural network target real-time detection model
CN102184221B (en) Real-time video abstract generation method based on user preferences
Poggi et al. Supervised segmentation of remote sensing images based on a tree-structured MRF model
CN108898145A (en) A kind of image well-marked target detection method of combination deep learning
CN107481188A (en) A kind of image super-resolution reconstructing method
CN110427839A (en) Video object detection method based on multilayer feature fusion
CN106570486A (en) Kernel correlation filtering target tracking method based on feature fusion and Bayesian classification
CN109993775B (en) Single target tracking method based on characteristic compensation
CN109978918A (en) A kind of trajectory track method, apparatus and storage medium
CN106778687A (en) Method for viewing points detecting based on local evaluation and global optimization
CN108416266A (en) A kind of video behavior method for quickly identifying extracting moving target using light stream
CN109671102A (en) A kind of composite type method for tracking target based on depth characteristic fusion convolutional neural networks
CN106778852A (en) A kind of picture material recognition methods for correcting erroneous judgement
CN110084782B (en) Full-reference image quality evaluation method based on image significance detection
CN110827304B (en) Traditional Chinese medicine tongue image positioning method and system based on deep convolution network and level set method
CN107730515A (en) Panoramic picture conspicuousness detection method with eye movement model is increased based on region
CN109871875A (en) A kind of building change detecting method based on deep learning
CN110163213A (en) Remote sensing image segmentation method based on disparity map and multiple dimensioned depth network model
CN108305253A (en) A kind of pathology full slice diagnostic method based on more multiplying power deep learnings
CN110222686A (en) Object detecting method, device, computer equipment and storage medium
CN107833241A (en) To real-time vision object detection method of the ambient lighting change with robustness
CN106991686A (en) A kind of level set contour tracing method based on super-pixel optical flow field
CN104715480B (en) A kind of object detection method based on Statistical background model
CN104484890A (en) Video target tracking method based on compound sparse model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant