CN113516130A - Entropy minimization-based semi-supervised image semantic segmentation method

Entropy minimization-based semi-supervised image semantic segmentation method

Info

Publication number
CN113516130A
Authority
CN
China
Prior art keywords
strategy
loss
consistency
segmentation
network
Prior art date
2021-07-19
Legal status
Granted
Application number
CN202110811842.5A
Other languages
Chinese (zh)
Other versions
CN113516130B (en)
Inventor
李佐勇 (Li Zuoyong)
吴嘉炜 (Wu Jiawei)
樊好义 (Fan Haoyi)
张晓青 (Zhang Xiaoqing)
赖桃桃 (Lai Taotao)
Current Assignee
Minjiang University
Original Assignee
Minjiang University
Priority date
2021-07-19
Filing date
2021-07-19
Publication date
2021-10-19
2021-07-19 Application filed by Minjiang University
2021-07-19 Priority to CN202110811842.5A
2021-10-19 Publication of CN113516130A
2024-01-05 Application granted
2024-01-05 Publication of CN113516130B
Status: Active
Anticipated expiration

Classifications

    • G06F18/24 Pattern recognition; Analysing; Classification techniques
    • G06N3/04 Computing arrangements based on biological models; Neural networks; Architecture, e.g. interconnection topology
    • G06N3/082 Neural networks; Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y02T10/40 Climate change mitigation technologies related to transportation; Engine management systems


Abstract

The invention relates to a semi-supervised image semantic segmentation method based on entropy minimization. First, a feature gradient map regularization strategy (FGMR) is proposed, which uses the gradient map of a low-level feature map in the encoder to enhance the encoder's ability to encode deep feature maps. Second, an adaptive sharpening strategy is proposed to keep the decision boundary of unlabeled data in low-density regions. Finally, to further reduce the influence of noise, a low-confidence consistency strategy is proposed to keep classification and segmentation consistent. Extensive experiments confirm the superiority of the proposed algorithm over existing methods.

Description

Entropy minimization-based semi-supervised image semantic segmentation method
Technical Field
The invention belongs to the technical field of computer vision and is used for semi-supervised image semantic segmentation, which is vital when only a small amount of annotated data and a large amount of unannotated data are available. It particularly relates to a semi-supervised image semantic segmentation method based on entropy minimization.
Background
In recent years, with the development of deep supervised learning, remarkable progress has been made on various computer vision tasks. However, training a deep neural network requires a large amount of labeled data, which is often time-consuming and expensive to acquire. Semantic segmentation in particular requires a large number of pixel-level labels, whose annotation cost is roughly 15 times that of region-level labels and 60 times that of image-level labels. The cost is even higher for medical image segmentation, which must be labeled by professional physicians. Therefore, there is growing interest in weakly supervised and semi-supervised segmentation methods.
Semi-supervised image semantic segmentation assumes a large amount of unlabeled data and a limited amount of labeled data drawn from the same distribution. Current mainstream semi-supervised segmentation methods can be divided into methods based on generative adversarial networks (GANs) and methods based on consistency training. GAN-based approaches extend the generic GAN framework to pixel-level prediction and try to fool the discriminator with predictions on unlabeled data. Consistency training methods expect the network output to be smooth under different perturbations. These methods have proven effective for semi-supervised image semantic segmentation, but they also have limitations. GAN-based methods exploit unlabeled data but require carefully designed network structures and are difficult to train. Consistency training methods must forward-propagate each perturbed input, incurring extra computation, and the perturbations implicitly act as data augmentation, which makes comparisons against fully supervised models trained without augmentation unfair.
Disclosure of Invention
The invention aims to overcome the above defects by providing a semi-supervised image semantic segmentation method based on entropy minimization. It first proposes a feature gradient map regularization strategy (FGMR), which uses the gradient map of a low-level feature map in the encoder to enhance the encoder's ability to encode deep feature maps; it then proposes an adaptive sharpening strategy that keeps the decision boundary of unlabeled data in low-density regions; and, to further reduce the influence of noise, it proposes a low-confidence consistency strategy to keep classification and segmentation consistent. The proposed method significantly improves semi-supervised semantic segmentation performance with essentially no network structure modification and no extra computational cost.
In order to achieve this purpose, the technical scheme of the invention is as follows: a semi-supervised image semantic segmentation method based on entropy minimization, which first proposes a feature gradient map regularization strategy FGMR that uses the gradient map of a low-level feature map in the encoder to enhance the encoder's ability to encode deep feature maps; then proposes an adaptive sharpening strategy that keeps the decision boundary of unlabeled data in low-density regions; and, to further reduce the influence of noise, proposes a low-confidence consistency strategy to keep classification and segmentation consistent.
In an embodiment of the present invention, the method modifies the semi-supervised image semantic segmentation network structure as follows: assuming the input image size is H × W and the number of classes is C, the network output is changed to the mean μ_s ∈ R^(H×W×C) and variance σ_s² of the segmentation result, and the last layer of the encoder additionally outputs the mean μ_c ∈ R^C and variance σ_c² of the classification result.
For the improvement of the loss function of the network: the loss function contains a supervised loss term and an unsupervised loss term,

L = L_s + λ·L_u

where L_s is the supervised loss term, L_u is the unsupervised loss term, and λ is a hyper-parameter that adjusts the balance between the supervised and unsupervised loss terms.

For labeled data x_l ∈ R^(H×W×3), the corresponding segmentation label is y_s ∈ R^(H×W×C) and the class label is y_c ∈ R^C. x_l is fed into the network to obtain the corresponding means and variances, from which the segmentation prediction z_s and the classification prediction z_c are sampled using the reparameterization trick; cross-entropy losses are then used to supervise the segmentation result z_s with y_s and the classification result z_c with y_c. For labeled data, the loss term is defined as:

L_s = Σ_{H,W,C} H(y_s^{H,W,C}, α_s(z_s^{H,W,C})) + Σ_C H(y_c^C, α_c(z_c^C))

where H(·,·) is the cross-entropy loss function and α(·) is the activation function of the last layer.
for the unlabeled data, firstly, enhancing the edge gradient value of the feature map obtained by the encoder by using a feature gradient map regularization strategy FGMR; then, searching a noise sample by using the variance as an accidental uncertainty, and guiding an adaptive sharpening strategy to obtain a pseudo label of the unlabeled data, wherein the pseudo label which may bring noise is used for supervising the unlabeled data; even though the accidental uncertainty performance filters some noise samples, the noise samples generated by the pseudo-tag still probably affect the performance of the network; to solve this problem, the low confidence class in the classification result is further used to suppress the segmentation prediction of the corresponding class to maintain the consistency of the class; the adaptive sharpening loss and the similar consistency loss can resist each other, so that the decision boundary is in a low-density area, and a steady prediction result is obtained; the unsupervised loss function is defined as:
Figure BDA0003168602120000024
wherein the content of the first and second substances,
Figure BDA0003168602120000025
and
Figure BDA0003168602120000026
the loss terms are respectively of a feature gradient map regularization strategy FGMR, an adaptive sharpening strategy adaptive sharp and a class consistency strategy class consistency.
In an embodiment of the present invention, the feature gradient map regularization strategy FGMR is implemented by the following loss:

L_fgmr = [equation image in the original publication]

where ∇ is the gradient operator and S_e is the encoder of the segmentation network; the gradient term used as the regularization target is detached during training so that no back-propagation is performed through it.
In an embodiment of the present invention, the adaptive sharpening strategy is implemented as follows.

First, a common sharpening strategy is defined as:

sharpen(p, T)_i = p_i^(1/T) / Σ_j p_j^(1/T)

where T is a temperature hyper-parameter. As T → 0, the result of sharpen(p, T) approaches a Dirac distribution. Since the sharpened result serves as the target for unlabeled data, lowering T encourages the model to produce low-entropy predictions. However, T must be set carefully; in image segmentation tasks in particular, it is not reasonable to use the same T value for all samples.

Therefore, an adaptive sharpening strategy is proposed that filters noisy samples using the variance as the aleatoric uncertainty and adaptively adjusts the T value of each sample according to the prediction confidence, so that the lower the confidence, the stronger the sharpening of the sample:

[adaptive sharpening equations shown as images in the original publication]

These equations adaptively generate a pseudo-label for each sample; the unlabeled data are then optimized with a mean-squared-error loss between the prediction and the pseudo-label:

L_as = [equation image in the original publication]
in an embodiment of the present invention, a specific implementation formula of the class consistency policy is as follows:
Figure BDA0003168602120000036
wherein p isc=softmax(μc),ps=softmax(μs) And β is a threshold for determining a low confidence coherency boundary.
In one embodiment of the present invention, the
Figure BDA0003168602120000037
Compared with the prior art, the invention has the following beneficial effects. The invention provides a novel entropy-minimization-based semi-supervised image semantic segmentation method. Entropy minimization has proven to be an effective way of enforcing the cluster assumption in semi-supervised learning, namely that decision boundaries should lie in low-density regions. Specifically, the invention proposes a feature gradient map regularization strategy to enlarge inter-class distances in feature space and obtain low-entropy segmentation predictions. In addition, an adaptive sharpening strategy based on aleatoric uncertainty and a class consistency regularization constraint are introduced to reduce the interference of noise in the pseudo-labels. Extensive experiments on the PASCAL VOC, PASCAL-Context and blood leukocyte datasets show that the method significantly improves semi-supervised semantic segmentation performance with essentially no network structure change and no extra computational cost.
Drawings
FIG. 1 is a network model architecture of the present invention.
FIG. 2 is a graph showing statistics and observations of the gradient of the U-net coding layer on the leukocyte test set.
FIG. 3 shows partial segmentation results on the PASCAL VOC dataset with 1/8 labeled samples.
FIG. 4 shows partial segmentation results on the blood leukocyte dataset with 1/10 labeled samples.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The invention relates to a semi-supervised image semantic segmentation method based on entropy minimization, which first proposes a feature gradient map regularization strategy FGMR, where the gradient map of a low-level feature map in the encoder is used to enhance the encoder's ability to encode deep feature maps; then proposes an adaptive sharpening strategy that keeps the decision boundary of unlabeled data in low-density regions; and, to further reduce the influence of noise, proposes a low-confidence consistency strategy to keep classification and segmentation consistent.
The following is a specific embodiment of the present invention.
1. Overview of the method
The invention only requires a slight modification of an existing segmentation network and does not need a carefully designed network structure. The network architecture of the invention is shown in FIG. 1. Assuming the input image size is H × W and the number of classes is C, the specific modification is: the network output is changed to the mean μ_s ∈ R^(H×W×C) and variance σ_s² of the segmentation result, and the last layer of the encoder additionally outputs the mean μ_c ∈ R^C and variance σ_c² of the classification result.
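For illustration only, a minimal PyTorch-style sketch of such a modification is shown below; the module names, the use of log-variance heads, and the global-pooling classification branch are assumptions of this sketch rather than details taken from the patent.

```python
import torch.nn as nn

class SemiSegNet(nn.Module):
    """Illustrative wrapper: adds mean/variance heads for segmentation and classification."""

    def __init__(self, encoder, decoder, enc_channels, dec_channels, num_classes):
        super().__init__()
        self.encoder = encoder                      # any backbone encoder (placeholder)
        self.decoder = decoder                      # produces (N, dec_channels, H, W)
        # per-pixel segmentation mean and log-variance over C classes
        self.seg_mu = nn.Conv2d(dec_channels, num_classes, kernel_size=1)
        self.seg_logvar = nn.Conv2d(dec_channels, num_classes, kernel_size=1)
        # image-level classification mean and log-variance from the last encoder layer
        self.cls_mu = nn.Linear(enc_channels, num_classes)
        self.cls_logvar = nn.Linear(enc_channels, num_classes)

    def forward(self, x):
        feats = self.encoder(x)                     # deepest encoder feature map (N, enc_channels, h, w)
        pooled = feats.mean(dim=(2, 3))             # global average pooling for the classification branch
        dec = self.decoder(feats)                   # decoder output at input resolution
        return {
            "seg_mu": self.seg_mu(dec),             # mu_s
            "seg_logvar": self.seg_logvar(dec),     # log sigma_s^2
            "cls_mu": self.cls_mu(pooled),          # mu_c
            "cls_logvar": self.cls_logvar(pooled),  # log sigma_c^2
        }
```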
In addition to the above minor changes to the network, the algorithm of the invention improves the loss function of the network, which contains both a supervised and an unsupervised loss term:

L = L_s + λ·L_u    (1)

where L_s is the supervised loss term, L_u is the unsupervised loss term, and λ is a hyper-parameter that adjusts the balance between the two.
For labeled data x_l ∈ R^(H×W×3), the corresponding segmentation label is y_s ∈ R^(H×W×C) and the class label is y_c ∈ R^C. x_l is fed into the network to obtain the corresponding means and variances, from which the segmentation prediction z_s and the classification prediction z_c are sampled using the reparameterization trick; the standard cross-entropy loss is then used to supervise the segmentation result z_s with y_s and the classification result z_c with y_c. For labeled data, the loss term is defined as:

L_s = Σ_{H,W,C} H(y_s^{H,W,C}, α_s(z_s^{H,W,C})) + Σ_C H(y_c^C, α_c(z_c^C))    (2)

where H(·,·) is the cross-entropy loss function and α(·) is the activation function of the last layer.
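The sampling step and the supervised loss of equation (2) could be sketched as follows; treating the image-level label as a single class index and folding the activation α into the cross-entropy call are simplifying assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps (reparameterization trick)."""
    std = torch.exp(0.5 * logvar)
    return mu + torch.randn_like(std) * std

def supervised_loss(out, y_seg, y_cls):
    """L_s: cross-entropy on sampled segmentation and classification predictions.

    y_seg: (N, H, W) integer class map; y_cls: (N,) integer image-level label.
    F.cross_entropy applies log-softmax internally, standing in for the activation alpha;
    if the class label were a multi-label presence vector, a binary cross-entropy
    would be used for the classification branch instead.
    """
    z_s = reparameterize(out["seg_mu"], out["seg_logvar"])   # (N, C, H, W)
    z_c = reparameterize(out["cls_mu"], out["cls_logvar"])   # (N, C)
    return F.cross_entropy(z_s, y_seg) + F.cross_entropy(z_c, y_cls)
```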
For unlabeled data, Feature Gradient Map Regularization (FGMR) is first used to enhance the edge gradient values of the feature maps obtained by the encoder; the variance is then used as the aleatoric uncertainty to identify noisy samples and to guide adaptive sharpening, which produces pseudo-labels for the unlabeled data. These pseudo-labels, which may introduce noise, are used to supervise the unlabeled data. Even though the aleatoric uncertainty filters out some noisy samples, the noise produced by the pseudo-labels may still affect network performance. To address this, the low-confidence classes in the classification result are further used to suppress the segmentation predictions of the corresponding classes so as to maintain class consistency. The adaptive sharpening loss and the class consistency loss counteract each other, keeping the decision boundary in a low-density region and yielding a stable prediction. The unsupervised loss function is defined as:

L_u = L_fgmr + L_as + L_cc

where L_fgmr, L_as and L_cc are the loss terms of Feature Gradient Map Regularization (FGMR), adaptive sharpening, and class consistency, respectively.
2. Feature gradient map regularization
As shown in FIG. 2, the gradient statistics of the different encoder layers show that the encoder's ability to extract edge information increases gradually from lower to higher layers. After consistency training, the average gradient of the different encoder layers is significantly enhanced. These results indicate that a good segmentation network seeks more edge information to improve segmentation accuracy. Inspired by these observations, a key goal for semantic segmentation is to improve the encoder's ability to discern target edges. As shown in FIG. 2(b) and FIG. 2(c), the gradient information of edges in the deep encoder layers is clearly enhanced after consistency training [1], which suggests that consistency training is effective precisely because it gives the encoder more edge-discriminating power. Therefore, combining the progressive nature of the gradient information across coding layers with the goal of improving edge discrimination, the feature gradient map regularization is designed as:

L_fgmr = [equation image in the original publication]

where ∇ is the gradient operator and S_e is the encoder of the segmentation network; the gradient term used as the regularization target is detached during training so that it is not back-propagated through.
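Because the exact FGMR formula appears only as an equation image, the sketch below encodes one possible reading of the description: the detached gradient map of a shallow encoder feature is used as a target that sharpens the gradient map of a deeper feature. The Sobel gradient operator, the bilinear resizing, the channel averaging, the MSE form, and the function names are all assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

# Sobel kernels as one concrete choice of gradient operator (an assumption;
# the patent only names "a gradient operator").
_SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
_SOBEL_Y = _SOBEL_X.transpose(2, 3)

def gradient_map(feat):
    """Channel-wise gradient magnitude of a feature map of shape (N, C, H, W)."""
    c = feat.shape[1]
    gx = F.conv2d(feat, _SOBEL_X.to(feat).repeat(c, 1, 1, 1), padding=1, groups=c)
    gy = F.conv2d(feat, _SOBEL_Y.to(feat).repeat(c, 1, 1, 1), padding=1, groups=c)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def fgmr_loss(low_feat, deep_feat):
    """One possible reading of L_fgmr: pull the deep feature's gradient map toward the
    detached gradient map of a shallow encoder feature (no back-propagation through it)."""
    target = gradient_map(low_feat).detach()
    pred = gradient_map(deep_feat)
    target = F.interpolate(target, size=pred.shape[2:], mode="bilinear", align_corners=False)
    # channel counts usually differ between layers; average over channels as a simplification
    return F.mse_loss(pred.mean(1, keepdim=True), target.mean(1, keepdim=True))
```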
3. Adaptive sharpening
The sharpening strategy proposed by the MixMatch algorithm [2] is used to reduce the entropy of the label distribution by adjusting the "temperature" of the class distribution. The sharpening strategy is defined as:

sharpen(p, T)_i = p_i^(1/T) / Σ_j p_j^(1/T)

where T is a temperature hyper-parameter. As T → 0, the result of sharpen(p, T) approaches a Dirac distribution. Since the sharpened result serves as the target for unlabeled data, lowering T encourages the model to produce low-entropy predictions. However, T must be set carefully; in image segmentation tasks in particular, it is not reasonable to use the same T value for all samples.
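The sharpening operation can be written directly from this definition; a minimal sketch (the function name is illustrative):

```python
def sharpen(p, T):
    """MixMatch-style sharpening: raise class probabilities to 1/T and renormalize.

    p: tensor of probabilities with classes on the last dimension;
    smaller T gives a more peaked (lower-entropy) distribution.
    """
    p_t = p.pow(1.0 / T)
    return p_t / p_t.sum(dim=-1, keepdim=True)
```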
Therefore, the adaptive sharpening proposed by the invention uses the variance predicted by the algorithm as the aleatoric uncertainty to filter noisy samples, and adaptively adjusts the T value of each sample according to the prediction confidence, so that the lower the confidence, the stronger the sharpening of the sample:

[adaptive sharpening equations shown as images in the original publication]

These equations adaptively generate a pseudo-label for each sample; the unlabeled data are then optimized with a mean-squared-error loss between the prediction and the pseudo-label:

L_as = [equation image in the original publication]
the self-adaptive sharpening provided by the invention enables the network model to pay more attention to non-noise samples and samples which are difficult to classify, and pay less attention to noise samples and samples which are easy to classify.
4. Category consistency
Since strongly sharpening hard samples may introduce additional noise into the network, an additional noise-smoothing strategy is needed. Because of the imbalance of the class distribution and the limited number of samples, the neural network's high-confidence predictions are not guaranteed to be correct and can easily mislead the segmentation result, whereas the network can far more easily be correct about its low-confidence predictions. Therefore, classification and segmentation are required to be consistent on low-confidence predictions rather than on high-confidence ones. The loss function can be expressed as:

L_cc = [equation image in the original publication]

where p_c = softmax(μ_c), p_s = softmax(μ_s), and β is a threshold that determines the low-confidence consistency boundary; its value in the invention is given as an equation image in the original publication.
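The class consistency loss is likewise given only as an equation image, so the sketch below encodes just the described intent: segmentation probability mass assigned to classes whose image-level confidence falls below β is penalized. The exact penalty form, the function name, and the value β = 0.3 are assumptions of this sketch.

```python
import torch.nn.functional as F

def class_consistency_loss(seg_mu, cls_mu, beta=0.3):
    """Illustrative low-confidence class consistency loss (beta = 0.3 is an assumed value).

    For every pixel, penalize the segmentation probability assigned to classes whose
    image-level classification confidence p_c falls below the threshold beta.
    """
    p_s = F.softmax(seg_mu, dim=1)                     # (N, C, H, W)
    p_c = F.softmax(cls_mu, dim=1)                     # (N, C)
    low_conf = (p_c < beta).float()                    # 1 for low-confidence classes
    return (p_s * low_conf[:, :, None, None]).mean()   # broadcast the class mask over pixels
```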
5. Experimental data and evaluation
The PASCAL VOC, PASCAL-Context and blood leukocyte datasets are used to evaluate the performance of the algorithm. The PASCAL VOC dataset consists of 21 classes (including background). Data augmentation is performed on the training set; the augmented dataset contains 10582 training images and 1449 validation images. The PASCAL-Context dataset is a whole-scene parsing dataset comprising 4998 training images and 5105 test images with dense semantic labels. Following previous work [3], semantic labels of the 60 most frequent classes, including the background class, are used. The blood leukocyte image dataset contains 3 categories; it was collected from a regular hospital and comprises 500 training images of size 256 × 256 and 500 test images of the same size.
The mean intersection-over-union (mIoU) is adopted as the metric for PASCAL VOC and PASCAL-Context, and F1-score, recall, precision and accuracy are used as the evaluation metrics for the blood leukocyte dataset.
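For reference, mIoU follows its standard definition (per-class intersection over union, averaged over classes); a small sketch, not specific to this patent:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes, from integer label maps (numpy arrays)."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                                  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```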
6. Ablation study
Table 1. Ablation study of the contribution of each loss term on the PASCAL VOC dataset with 1/8 labeled data.
[Table shown as an image in the original publication.]
The loss of the invention consists of three terms, so the effectiveness of each loss term and of their combinations is explored. The ablation results are shown in Table 1, where CE, sharp, AS, CC and FGMR denote cross entropy, sharpening, adaptive sharpening, class consistency and feature gradient map regularization, respectively. All three loss terms effectively improve performance, and performance improves further after feature gradient map regularization is added.
7. Qualitative and quantitative comparison
Table 2 shows the evaluation results on the PASCAL VOC and PASCAL-Context datasets. When unlabeled samples are used under different data partitions, the performance of the algorithm improves by 2.4% to 7.7% over the baseline method (DeepLabv2). The method of Hung et al. and the s4GAN method are representative semi-supervised image semantic segmentation methods of the last two years. Under the same experimental setup, the algorithm achieves the best results on the PASCAL VOC dataset with 1/3, 1/8 and 1/20 labeled samples and on the PASCAL-Context dataset with 1/3 and 1/8 labeled samples. FIG. 3 shows qualitative results on the PASCAL VOC dataset using 1/8 labeled samples.
Table 2. Comparison of the segmentation results of different methods on the PASCAL VOC and PASCAL-Context datasets.
[Table shown as an image in the original publication.]
To further demonstrate the generality of the algorithm, the blood leukocyte dataset with 1/10 labeled samples is tested without data augmentation. The data in Table 3 show that, over the baseline (U-net) method, the algorithm improves the F1 score by 2.23%, recall by 1.67%, precision by 2.46%, and accuracy by 0.95%. Comparison with the current state-of-the-art semi-supervised medical semantic segmentation methods shows that the algorithm achieves the best segmentation results at the lowest cost. FIG. 4 shows partial segmentation results on the leukocyte dataset with 1/10 labeled samples; the algorithm effectively segments the cytoplasm even in leukocyte images whose cytoplasm is close to the background.
Table 3. Comparison of semi-supervised segmentation performance on the blood leukocyte dataset using 1/10 labeled samples.
[Table shown as an image in the original publication.]
8. Spatial complexity comparison
Table 4. Comparison of spatial complexity on the PASCAL VOC dataset.
[Table shown as an image in the original publication.]
As shown by the spatial-complexity comparison on the PASCAL VOC dataset in Table 4, the algorithm of the invention adds only 1.16M extra parameters over the baseline (DeepLabv2), whereas the methods of Hung et al. and Mittal et al. add 2.78M extra parameters. The extra parameters of the algorithm are thus less than half of those of the compared methods.
References:
[1] Chen S, Bortsova G, García-Uceda Juárez A, et al. Multi-task attention-based semi-supervised learning for medical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2019: 457-465.
[2] Berthelot D, Carlini N, Goodfellow I, et al. MixMatch: A holistic approach to semi-supervised learning[J]. arXiv preprint arXiv:1905.02249, 2019.
[3] Mittal S, Tatarchenko M, Brox T. Semi-supervised semantic segmentation with high- and low-level consistency[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[4] Hung W C, Tsai Y H, Liou Y T, et al. Adversarial learning for semi-supervised semantic segmentation[C]//29th British Machine Vision Conference, BMVC 2018, 2019.
[5] Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834-848.
[6] Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015: 234-241.
[7] Chen S, Bortsova G, García-Uceda Juárez A, et al. Multi-task attention-based semi-supervised learning for medical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2019: 457-465.
[8] Yu L, Wang S, Li X, et al. Uncertainty-aware self-ensembling model for semi-supervised 3D left atrium segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2019: 605-613.
the above are preferred embodiments of the present invention, and all changes made according to the technical scheme of the present invention that produce functional effects do not exceed the scope of the technical scheme of the present invention belong to the protection scope of the present invention.

Claims (6)

1. A semi-supervised image semantic segmentation method based on entropy minimization, characterized in that a feature gradient map regularization strategy FGMR is first proposed, which uses the gradient map of a low-level feature map in the encoder to enhance the encoder's ability to encode deep feature maps; an adaptive sharpening strategy is then proposed to keep the decision boundary of unlabeled data in low-density regions; and, to further reduce the influence of noise, a low-confidence consistency strategy is proposed to keep classification and segmentation consistent.
2. The entropy-minimization-based semi-supervised image semantic segmentation method according to claim 1, wherein the method modifies the semi-supervised image semantic segmentation network structure as follows: assuming the input image size is H × W and the number of classes is C, the network output is changed to the mean μ_s ∈ R^(H×W×C) and variance σ_s² of the segmentation result, and the last layer of the encoder additionally outputs the mean μ_c ∈ R^C and variance σ_c² of the classification result;

for the improvement of the loss function of the network, the loss function contains a supervised loss term and an unsupervised loss term:

L = L_s + λ·L_u

where L_s is the supervised loss term, L_u is the unsupervised loss term, and λ is a hyper-parameter that adjusts the balance between the supervised and unsupervised loss terms;

for labeled data x_l ∈ R^(H×W×3), the corresponding segmentation label is y_s ∈ R^(H×W×C) and the class label is y_c ∈ R^C; x_l is fed into the network to obtain the corresponding means and variances, from which the segmentation prediction z_s and the classification prediction z_c are sampled using the reparameterization trick; cross-entropy losses are then used to supervise the segmentation result z_s with y_s and the classification result z_c with y_c; for labeled data, the loss term is defined as:

L_s = Σ_{H,W,C} H(y_s^{H,W,C}, α_s(z_s^{H,W,C})) + Σ_C H(y_c^C, α_c(z_c^C))

where H(·,·) is the cross-entropy loss function and α(·) is the activation function of the last layer;

for unlabeled data, the feature gradient map regularization strategy FGMR is first used to enhance the edge gradient values of the feature maps obtained by the encoder; the variance is then used as the aleatoric uncertainty to identify noisy samples and to guide the adaptive sharpening strategy that produces pseudo-labels for the unlabeled data, and these pseudo-labels, which may introduce noise, are used to supervise the unlabeled data; even though the aleatoric uncertainty filters out some noisy samples, the noise produced by the pseudo-labels may still affect network performance; to address this, the low-confidence classes in the classification result are further used to suppress the segmentation predictions of the corresponding classes so as to maintain class consistency; the adaptive sharpening loss and the class consistency loss counteract each other, keeping the decision boundary in a low-density region and yielding a robust prediction; the unsupervised loss function is defined as:

L_u = L_fgmr + L_as + L_cc

where L_fgmr, L_as and L_cc are the loss terms of the feature gradient map regularization strategy FGMR, the adaptive sharpening strategy, and the class consistency strategy, respectively.
3. The entropy-minimization-based semi-supervised image semantic segmentation method according to claim 2, wherein the feature gradient map regularization strategy FGMR is implemented by the following loss:

L_fgmr = [equation image in the original publication]

where ∇ is the gradient operator and S_e is the encoder of the segmentation network; the gradient term used as the regularization target is detached during training so that no back-propagation is performed through it.
4. The entropy-minimization-based semi-supervised image semantic segmentation method according to claim 2, wherein the adaptive sharpening strategy is implemented as follows:

first, a common sharpening strategy is defined as:

sharpen(p, T)_i = p_i^(1/T) / Σ_j p_j^(1/T)

where T is a temperature hyper-parameter; as T → 0, the result of sharpen(p, T) approaches a Dirac distribution; since the sharpened result serves as the target for unlabeled data, lowering T encourages the model to produce low-entropy predictions; however, T must be set carefully, and in image segmentation tasks in particular it is not reasonable to use the same T value for all samples;

therefore, an adaptive sharpening strategy is proposed that filters noisy samples using the variance as the aleatoric uncertainty and adaptively adjusts the T value of each sample according to the prediction confidence, so that the lower the confidence, the stronger the sharpening of the sample:

[adaptive sharpening equations shown as images in the original publication]

these equations adaptively generate a pseudo-label for each sample, and the unlabeled data are then optimized with a mean-squared-error loss between the prediction and the pseudo-label:

L_as = [equation image in the original publication]
5. The entropy-minimization-based semi-supervised image semantic segmentation method according to claim 2, wherein the class consistency strategy is implemented by the following loss:

L_cc = [equation image in the original publication]

where p_c = softmax(μ_c), p_s = softmax(μ_s), and β is a threshold that determines the low-confidence consistency boundary.
6. The entropy-minimization-based semi-supervised image semantic segmentation method according to claim 5, wherein β is set to a fixed value, given as an equation image in the original publication.

Priority Applications (1)

Application Number: CN202110811842.5A, Priority Date: 2021-07-19, Filing Date: 2021-07-19, Title: Semi-supervised image semantic segmentation method based on entropy minimization, granted as CN113516130B (en)


Publications (2)

Publication Number Publication Date
CN113516130A true CN113516130A (en) 2021-10-19
CN113516130B CN113516130B (en) 2024-01-05

Family

ID=78068524

Family Applications (1)

Application Number: CN202110811842.5A, Status: Active, Granted publication: CN113516130B (en), Title: Semi-supervised image semantic segmentation method based on entropy minimization

Country Status (1)

Country Link
CN (1) CN113516130B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222690A * 2019-04-29 2019-09-10 Zhejiang University Unsupervised domain adaptation semantic segmentation method based on maximum squares loss
CN110837836A * 2019-11-05 2020-02-25 University of Science and Technology of China Semi-supervised semantic segmentation method based on maximized confidence
CN112036335A * 2020-09-03 2020-12-04 Nanjing Agricultural University Deconvolution-guided semi-supervised plant leaf disease identification and segmentation method
CN113128620A * 2021-05-11 2021-07-16 Beijing Institute of Technology Semi-supervised domain-adaptive image classification method based on hierarchical relationship


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wu Jiawei et al., "A two-stage image dehazing network based on deep learning", Computer Applications and Software, vol. 37, no. 4, pp. 197-202 *

Also Published As

Publication number Publication date
CN113516130B (en) 2024-01-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant