CN116543162A - Image segmentation method and system based on feature difference and context awareness consistency


Info

Publication number
CN116543162A
Authority
CN
China
Prior art keywords
image
image segmentation
segmentation
consistency
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310533868.7A
Other languages
Chinese (zh)
Other versions
CN116543162B (en)
Inventor
袭肖明
王逸潇
张光
陈关忠
宁阳
刘新锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Jianzhu University
Original Assignee
Shandong Jianzhu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Jianzhu University
Priority to CN202310533868.7A (granted as CN116543162B)
Publication of CN116543162A
Application granted
Publication of CN116543162B
Legal status: Active
Anticipated expiration


Classifications

    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06N 3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/096: Transfer learning
    • G06V 10/28: Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V 10/40: Extraction of image or video features
    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • Y02T 10/40: Engine management systems (climate change mitigation technologies related to transportation)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image segmentation method and system based on feature-difference and context-aware consistency, comprising the following steps: acquiring labeled images and unlabeled images; training a constructed image segmentation model on the labeled and unlabeled images to obtain a trained image segmentation model; and segmenting the image to be segmented with the trained model to obtain the image segmentation result. The image segmentation model is constructed as a teacher-student model. The student model comprises an image segmentation network branch and a context-consistency branch; the context-consistency branch comprises a preprocessing module, an encoder, a decoder and a Projector module. When the model is trained on unlabeled images, a bidirectional contrast loss over the feature differences output by the Projector module is computed, and the output of the decoder is supervised by the output of the image segmentation network branch. The method improves the segmentation accuracy of the model.

Description

Image segmentation method and system based on feature difference and context awareness consistency
Technical Field
The invention relates to the technical field of image segmentation, and in particular to an image segmentation method and system based on feature-difference and context-aware consistency.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Image semantic segmentation is an important task in computer vision, applied mainly to target extraction in mobile scenes, content-based image retrieval, lesion diagnosis in medical images, and so on. With the development of deep learning, image segmentation based on deep learning has advanced greatly in recent years. In a fully supervised setting, improving the accuracy of an image segmentation model usually depends on a large amount of labeled data, and acquiring labeled data requires substantial labor cost. The segmentation task requires pixel-level labels for fully supervised training, which further increases the difficulty of acquiring labeled data; the scarcity of labeled data limits improvements in segmentation accuracy. Compared with labeled data, unlabeled data is very easy to acquire. Semi-supervised learning can effectively exploit the latent knowledge contained in large amounts of unlabeled data and thereby improve the accuracy of the segmentation model.
Conventional semi-supervised learning typically applies a vanilla contrastive loss (CL) to picture-level features. The basic idea of CL is to pull similar samples together and push dissimilar samples apart by computing similarities between samples, simply aligning them with each other while ignoring the confidence of the features. Since predictions with higher confidence are generally more accurate, moving high-confidence and low-confidence features toward each other reduces the discriminability of the high-confidence features, produces unreliable segmentation results, and lowers the segmentation accuracy of the model. Furthermore, existing semi-supervised models may attend to the foreground only through the contexts available in the training data, resulting in poor generalization ability.
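For reference, a minimal sketch of the vanilla contrastive loss described above (PyTorch; all names are hypothetical): it pulls an anchor toward its positive and away from negatives by similarity alone, with no notion of which feature is more confident.

```python
import torch
import torch.nn.functional as F

def vanilla_contrastive_loss(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style vanilla CL on picture-level features.

    anchor:    (D,)   anchor feature
    positive:  (D,)   feature of a similar sample
    negatives: (N, D) features of dissimilar samples
    Both sides of the positive pair receive gradients, so high- and
    low-confidence features are simply pulled toward each other.
    """
    anchor = F.normalize(anchor, dim=0)
    positive = F.normalize(positive, dim=0)
    negatives = F.normalize(negatives, dim=1)
    pos = torch.dot(anchor, positive) / tau          # similarity to the positive
    neg = negatives @ anchor / tau                   # similarities to negatives, (N,)
    logits = torch.cat([pos.unsqueeze(0), neg]).unsqueeze(0)
    target = torch.zeros(1, dtype=torch.long)        # the positive sits at index 0
    return F.cross_entropy(logits, target)
```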
Disclosure of Invention
In order to solve the above problems, the invention provides an image segmentation method and system based on feature-difference and context-aware consistency. When the image segmentation model is trained on unlabeled images, a bidirectional contrast loss over the feature differences of foreground and background is computed, aligning low-confidence features to high-confidence features; this ensures intra-class consistency and inter-class discriminability in the feature space and improves the segmentation accuracy of the model.
In order to achieve the above purpose, the invention adopts the following technical scheme:
In a first aspect, an image segmentation method based on feature-difference and context-aware consistency is provided, comprising:
acquiring labeled images and unlabeled images;
training the constructed image segmentation model on the labeled and unlabeled images to obtain a trained image segmentation model;
wherein the image segmentation model is constructed as a teacher-student model; the student model comprises an image segmentation network branch and a context-consistency branch, and the context-consistency branch comprises a preprocessing module, an encoder, a decoder and a Projector module; when the image segmentation model is trained on an unlabeled image, the preprocessing module performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; the cropped image and the data-enhanced image are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are input into the decoder and the Projector module; the decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and the Projector module outputs the cropped-image segmentation features and the data-enhanced-image segmentation features, over which a feature-difference bidirectional contrast loss is computed; and
segmenting the image to be segmented with the trained image segmentation model to obtain the image segmentation result.
In a second aspect, an image segmentation system based on feature-difference and context-aware consistency is provided, comprising:
an image acquisition module, used to acquire labeled images and unlabeled images;
an image segmentation model training module, used to train the constructed image segmentation model on the labeled and unlabeled images to obtain a trained image segmentation model;
wherein the image segmentation model is constructed as a teacher-student model; the student model comprises an image segmentation network branch and a context-consistency branch, and the context-consistency branch comprises a preprocessing module, an encoder, a decoder and a Projector module; when the image segmentation model is trained on an unlabeled image, the preprocessing module performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; the cropped image and the data-enhanced image are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are input into the decoder and the Projector module; the decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and the Projector module outputs the cropped-image segmentation features and the data-enhanced-image segmentation features, over which a feature-difference bidirectional contrast loss is computed; and
an image segmentation module, used to segment the image to be segmented with the trained image segmentation model to obtain the image segmentation result.
In a third aspect, an electronic device is provided, comprising a memory, a processor, and computer instructions stored on the memory and run on the processor; when the computer instructions are executed by the processor, the steps of the image segmentation method based on feature-difference and context-aware consistency are performed.
In a fourth aspect, a computer-readable storage medium is provided for storing computer instructions which, when executed by a processor, perform the steps of the image segmentation method based on feature-difference and context-aware consistency.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention introduces a preprocessing module and a Projector module into the context-aware consistency branch. The preprocessing module performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; the two images are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are fed into the decoder and the Projector module. The decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and the Projector module outputs the cropped-image segmentation features and the data-enhanced-image segmentation features; the image segmentation model is trained with the bidirectional contrast loss over the difference between these two feature sets, which improves the segmentation accuracy of the model.
2. The invention computes the feature-difference bidirectional contrast loss between foreground and background when training the image segmentation network branch, so that low-confidence features are aligned to high-confidence features; intra-class consistency and inter-class discriminability in the feature space are ensured, and the segmentation accuracy of the model is improved.
3. When the image segmentation model is trained on unlabeled images, the context-aware consistency branch introduces context information into the image segmentation network branch. Cropping and data enhancement are applied simultaneously to the same unlabeled image, so the two processed images carry different context information; a consistency regularization loss over the results of the two differently contextualized images ensures that the foreground and background of the same object are segmented consistently under different contexts. This avoids over-reliance on the contexts available in the training data, makes the high-level features less susceptible to environmental changes, and improves robustness.
4. To let the decoder in the context-consistency branch provide more reliable pseudo labels and to reduce the influence of potential noise in the pseudo labels, the invention introduces an uncertainty loss in the context-consistency branch. Adaptive pixel weights are computed from uncertainty estimates, giving higher weight to pixels with higher confidence; the loss thus focuses on reliable pixels, reduces the influence of label noise, better exploits the context information to enhance segmentation performance, and further improves the segmentation accuracy of the model.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application.
Fig. 1 is a structural diagram of the image segmentation model disclosed in embodiment 1.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
Example 1
In this embodiment, an image segmentation method based on feature-difference and context-aware consistency is disclosed, comprising:
s1: a marked image and an unmarked image are acquired.
S2: the constructed image segmentation model is trained on the labeled and unlabeled images to obtain a trained image segmentation model;
the image segmentation model is built by adopting a teacher-student model, the student model comprises an image segmentation network branch and a context consistency branch, the context consistency branch comprises a preprocessing module, an encoder, a decoder and a Projector module, when the unmarked image is used for training the image segmentation model, the preprocessing module is used for performing foreground cutting and data enhancement on the unmarked image to obtain a cut image and a data enhanced image, the cut image and the data enhanced image are input into the encoder to obtain cut image characteristics and data enhanced image characteristics, the cut image characteristics and the data enhanced image characteristics are input into the decoder and the Projector module, a cut image segmentation result and a data enhanced image segmentation result are output through the decoder, and characteristic difference bidirectional contrast loss is performed on the cut image segmentation characteristics and the data enhanced image segmentation characteristics; and the output result of the decoder is supervised by using the branch output result of the image segmentation network.
As shown in fig. 1, the image segmentation model disclosed in this embodiment is constructed as a teacher-student model; it takes the image to be segmented as input and outputs the image segmentation result. The student model is a semi-supervised segmentation model comprising an image segmentation network branch and a context-consistency branch. The image segmentation network branch segments the input image to be recognized and outputs the image segmentation result. The context-consistency branch comprises a preprocessing module, an encoder (Encoder), a decoder (Decoder) and a Projector module: the input image enters the preprocessing module, which performs foreground cropping and data enhancement to obtain a cropped image and a data-enhanced image; both images are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are fed into the decoder and into the Projector module (the latter used to compute the feature-difference contrast loss). The decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and the Projector module outputs the cropped-image segmentation features and the data-enhanced-image segmentation features.
The image segmentation network branch is used to supervise the context-consistency branch, and the context-consistency branch provides context information, ensuring that the image segmentation network branch segments the target under different contexts with more accurate results.
The teacher model has the same structure as the image segmentation network branch in the student model; during training, the weights of the teacher model are updated from the weights of the image segmentation network branch in the student model.
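To make the structure concrete, a minimal PyTorch skeleton of the arrangement described above; the backbone layers, channel sizes and projector head are illustrative assumptions, since the patent does not fix a specific network.

```python
import copy
import torch
import torch.nn as nn

class ContextConsistencyBranch(nn.Module):
    """Encoder, decoder and Projector module; preprocessing happens outside."""
    def __init__(self, ch=64, num_classes=2, proj_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.decoder = nn.Conv2d(ch, num_classes, 1)          # segmentation logits
        self.projector = nn.Sequential(                       # features for the contrast loss
            nn.Conv2d(ch, ch, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, proj_dim, 1))

    def forward(self, x):
        f = self.encoder(x)
        return self.decoder(f), self.projector(f)             # (segmentation result, segmentation features)

class StudentModel(nn.Module):
    def __init__(self, seg_branch: nn.Module):
        super().__init__()
        self.seg_branch = seg_branch                          # image segmentation network branch
        self.context_branch = ContextConsistencyBranch()

def make_teacher(student: StudentModel) -> nn.Module:
    """The teacher copies the segmentation-branch structure; its weights are
    then maintained by EMA updates from the student rather than by gradients."""
    teacher = copy.deepcopy(student.seg_branch)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher
```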
In this embodiment, the labeled and unlabeled images are input into the constructed image segmentation model, which is trained by gradient descent; when training finishes, the trained image segmentation model is obtained.
When training the image segmentation model on a labeled image, the labeled image is segmented by the image segmentation network branch to obtain the foreground image of the labeled image; the features of the labeled image are extracted by the encoder of the context-consistency branch, input into the decoder, and the predicted segmentation result of the labeled image is output. The cross-entropy loss between the foreground image of the labeled image and the ground-truth label is computed and used to train the image segmentation network branch; the cross-entropy loss and the Dice loss between the predicted segmentation result and the ground-truth label are computed and used to train the context-consistency branch.
When training the image segmentation model on an unlabeled image, the unlabeled image is segmented by the image segmentation network branch to obtain its foreground features and background features, and the feature-difference bidirectional contrast loss between the foreground and background features is computed. The preprocessing module in the context-consistency branch performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; both are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are input into the decoder and the Projector module. The decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result. The Projector module identifies the foreground features among the cropped-image features and the data-enhanced-image features and outputs the cropped-image segmentation features and the data-enhanced-image segmentation features, over which the feature-difference bidirectional contrast loss is computed. In addition, the uncertainty loss between the cropped-image segmentation result and the predicted segmentation result of the unlabeled image in the image segmentation network branch, the uncertainty loss between the data-enhanced-image segmentation result and that predicted segmentation result, and the consistency regularization loss between the cropped-image segmentation result and the data-enhanced-image segmentation result are computed. The student model is trained with the feature-difference bidirectional contrast loss between the foreground and background features of the unlabeled image, the uncertainty losses, the consistency regularization loss, and the feature-difference bidirectional contrast loss between the cropped-image and data-enhanced-image segmentation features. The pseudo labels of the unlabeled image are obtained by applying a 1×1 convolution to the foreground features extracted by the image segmentation network branch and converting them into probabilities through softmax.
In this embodiment, the context-consistency branch is first trained on labeled images and then on unlabeled images.
Specifically, the labeled image is data-enhanced and input into the image segmentation network branch, which segments it to obtain the foreground image and the background image of the labeled image; the cross-entropy loss between the foreground image of the labeled image and the ground-truth label y is computed, so that the segmentation result is supervised by the ground-truth label y to ensure its accuracy.
The cross-entropy loss L_ce between the foreground image of the labeled image and the ground-truth label y is:
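A reconstruction in standard form, assuming a pixel-wise binary cross-entropy over an H×W foreground probability map p_f (the symbols are assumptions):

```latex
L_{ce} = -\frac{1}{HW}\sum_{i=1}^{HW}\Big[\, y_i \log p_{f,i} + (1-y_i)\log\big(1-p_{f,i}\big) \Big]
```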
taking the unmarked image as the input of an image segmentation network branch, and carrying out image segmentation on the unmarked image through the image segmentation network branch to obtain the foreground characteristic of the unmarked imageAnd background features->Computing foreground features->And background features->Characteristic difference two-way contrast loss-> Is a foreground collection, is->D b Is a set of background features.
For each direction of the loss, the gradient is back-propagated only to the lower-confidence feature, so that low-confidence features move toward high-confidence ones rather than the reverse. Here N is the number of foreground pixels, h and w are the height and width of the input region, and r(·,·) computes the cosine similarity between features. An indicator function judges whether the confidence difference between the two compared features exceeds a threshold γ; the threshold ensures that the alignment target is sufficiently confident, so that low-confidence features are not corrupted by targets of insufficient confidence. To obtain a better segmentation effect, the two directional losses are combined, giving the bidirectional form of the feature-difference contrast loss.
The total loss L_class of the image segmentation network branch is:

where B denotes the training batch size and λ controls the weight of the unsupervised loss.
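A minimal PyTorch sketch of the feature-difference bidirectional contrast loss under the stated assumptions: cosine similarity plays the role of r, per-pixel confidences are supplied by the caller, and the indicator gates each direction on a confidence margin γ. All names are hypothetical; the patent gives the loss only as rendered formulas.

```python
import torch
import torch.nn.functional as F

def directional_contrast(feat_lo, feat_hi, conf_lo, conf_hi, negatives,
                         gamma=0.1, tau=0.1):
    """One direction: pull feat_lo toward feat_hi. The gradient is stopped on
    feat_hi, and a pair only contributes where feat_hi is more confident than
    feat_lo by at least gamma (the indicator 1[conf_hi - conf_lo > gamma])."""
    feat_lo = F.normalize(feat_lo, dim=1)                    # (N, D)
    feat_hi = F.normalize(feat_hi, dim=1).detach()           # gradient flows to feat_lo only
    negatives = F.normalize(negatives, dim=1).detach()       # (M, D) background features

    pos = (feat_lo * feat_hi).sum(dim=1) / tau               # cosine similarity r, (N,)
    neg = feat_lo @ negatives.t() / tau                      # (N, M)
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)       # positive pair at index 0
    target = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    loss = F.cross_entropy(logits, target, reduction="none")

    gate = ((conf_hi - conf_lo) > gamma).float()             # confidence-margin indicator
    return (gate * loss).sum() / gate.sum().clamp(min=1.0)

def feature_difference_bidirectional_loss(f_a, f_b, c_a, c_b, bg, gamma=0.1):
    """Both directions combined: each side is aligned toward the other only
    where the other side is sufficiently more confident."""
    return (directional_contrast(f_a, f_b, c_a, c_b, bg, gamma)
            + directional_contrast(f_b, f_a, c_b, c_a, bg, gamma))
```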
The labeled image is also input into the context-consistency branch: the encoder of the branch extracts the features of the labeled image, the features are input into the decoder, and the predicted segmentation result of the labeled image is output; the cross-entropy loss and the Dice loss L_seg between the predicted segmentation result and the ground-truth label y are then computed.
The unlabeled image is input into the context-consistency branch. The preprocessing module performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; both are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are input into the decoder and the Projector module. The decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and the decoder output serves as the pseudo label. The Projector module identifies the foreground features among the cropped-image features and the data-enhanced-image features and outputs the cropped-image segmentation features and the data-enhanced-image segmentation features; the corresponding foreground features of the two views are regarded as positive sample pairs, while pairs of a background feature and a foreground feature are regarded as negative sample pairs, and the feature-difference bidirectional contrast loss between the cropped-image segmentation features and the data-enhanced-image segmentation features is computed. Meanwhile, the unprocessed unlabeled image is input into the image segmentation network branch to obtain its predicted segmentation result. The pseudo supervision between the decoder outputs and this predicted result is corrected by uncertainty estimation, which introduces adaptive pixel weights into the pseudo-supervision loss, and the pseudo-supervision loss is further minimized by reducing the cosine distance between the predictions for the cropped image and the data-enhanced image.
The output of the decoder is supervised by the result of the image segmentation network branch. Uncertainty estimation introduces an adaptive pixel weight into this pseudo-supervision loss: pixels with higher confidence receive higher weights and pixels with lower confidence receive lower weights.
Specifically, the preprocessing module comprises two branches. One branch identifies the foreground of the unlabeled image; once the foreground is identified, the image is cropped with a rectangular box of preset size and coordinates, and the original image is cut to the required region along the determined rectangular boundary. This can be implemented with the image processing library OpenCV, and the cropped region is displayed directly in the program. The other branch applies data enhancement to the whole unlabeled image, yielding the data-enhanced image.
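A brief NumPy/OpenCV sketch of this cropping step, assuming the foreground arrives as a binary mask; the Otsu threshold below is only a stand-in for the learned foreground branch, and all names are hypothetical.

```python
import cv2
import numpy as np

def crop_foreground(image: np.ndarray, mask: np.ndarray, pad: int = 16) -> np.ndarray:
    """Crop the image to the (padded) bounding rectangle of the foreground mask.

    image: (H, W, 3) BGR image; mask: (H, W) binary foreground mask.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:                       # no foreground found: keep the image unchanged
        return image
    h, w = mask.shape
    x0, y0 = max(0, xs.min() - pad), max(0, ys.min() - pad)
    x1, y1 = min(w, xs.max() + pad), min(h, ys.max() + pad)
    return image[y0:y1, x0:x1]             # slicing realizes the rectangular crop

# Example with a thresholded mask standing in for the foreground branch:
img = cv2.imread("unlabeled.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, fg = cv2.threshold(gray, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cropped = crop_foreground(img, fg)
```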
During network training, the unlabeled image mainly involves the following loss functions:
(1) An uncertainty function is introduced to correct the pseudo-supervision loss between the cropped-image segmentation result and the predicted segmentation result of the unlabeled image in the image segmentation network branch, and likewise the pseudo-supervision loss between the data-enhanced-image segmentation result and that predicted segmentation result.
(2) Consistency regularization is introduced, using the cosine distance between the two views' predictions as the pseudo-supervision loss after data enhancement.
(3) A linear combination of the uncertainty losses and the consistency regularization loss constitutes the pseudo-supervision loss.
Pseudo supervision is implemented by supervising the decoder outputs for the cropped image and the data-enhanced image with the prediction of the image segmentation network branch, through the loss function below, which ensures the correctness of the model's predictions for unlabeled images. Here Z denotes the logits of the input image, i.e. the model output before the softmax activation, σ is the softmax operation, and T is a temperature that adjusts the output distribution of Z. Because a wrongly predicted pseudo label cannot give correct guidance, the KL divergence (relative entropy, which accounts for the difference between probability distributions) is added, introducing an uncertainty-estimation-based correction of the pseudo supervision. Through the adaptive pixel weights of this loss, each pixel's weight is assigned according to its confidence, which reduces the influence of label noise and strengthens the robustness of the model. The uncertainty loss between the pseudo label output by the decoder and the predicted segmentation result of the unlabeled image is:
To make the predictions for the cropped image and the data-enhanced image as close as possible, consistency regularization is introduced to minimize the inconsistency between them. The cosine distance measures the similarity between the two predictions, and the pseudo-supervision loss can be minimized by reducing this cosine distance; this embodiment therefore computes the consistency regularization loss as the cosine distance between the cropped-image segmentation result and the data-enhanced-image segmentation result.
The pseudo-supervision loss is a linear combination of the uncertainty loss and the consistency regularization loss:
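A hedged PyTorch sketch of this combination. The adaptive pixel weight is assumed to be exp(−KL) between the temperature-scaled distributions, since the patent shows the exact weighting only as formula images; the two decoder predictions are assumed to be aligned to a common region and size before the cosine term is computed.

```python
import torch
import torch.nn.functional as F

def uncertainty_loss(student_logits, teacher_logits, T=2.0):
    """Pseudo supervision corrected by uncertainty estimation.

    student_logits: decoder logits Z for one view, (B, C, H, W);
    teacher_logits: image-segmentation-branch logits for the same image.
    The per-pixel KL divergence between sigma(Z/T) distributions serves both
    as the supervision signal and, via exp(-KL) (an assumed form), as the
    adaptive pixel weight: confident pixels count more, noisy ones less.
    """
    p_t = F.softmax(teacher_logits.detach() / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    kl = (p_t * (p_t.clamp_min(1e-8).log() - log_p_s)).sum(dim=1)   # (B, H, W)
    w = torch.exp(-kl.detach())                                     # adaptive pixel weight
    return (w * kl).mean()

def consistency_loss(pred_crop, pred_aug):
    """Cosine-distance consistency between the two views' aligned predictions."""
    cos = F.cosine_similarity(pred_crop.flatten(1), pred_aug.flatten(1), dim=1)
    return (1.0 - cos).mean()

def pseudo_supervision_loss(dec_crop, dec_aug, branch_pred, alpha=1.0, beta=1.0):
    """Linear combination of the two uncertainty losses and the consistency term."""
    return (alpha * (uncertainty_loss(dec_crop, branch_pred)
                     + uncertainty_loss(dec_aug, branch_pred))
            + beta * consistency_loss(dec_crop, dec_aug))
```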
Different feature representations between classes are learned through this difference-based bidirectional contrast learning: the foreground features of the cropped-image segmentation features and of the data-enhanced-image segmentation features are regarded as positive sample pairs, while a background feature in the data-enhanced-image segmentation features paired with a foreground feature is regarded as a negative sample pair. The feature-difference bidirectional contrast loss over the cropped-image and data-enhanced-image segmentation results is then computed, where D_n denotes the set of background pixels.
Therefore, in this embodiment, the loss function L of the unlabeled image in the context consistency branch unsup The method comprises the following steps:
wherein alpha and beta are adjustable super parameters.
The total loss L_context of the context-consistency branch is:
When the image segmentation model is trained, the consistency regularization loss between the segmentation result output by the student model and the segmentation result output by the teacher model is also computed, and the image segmentation model is trained with this loss.
Specifically, the student model is updated by stochastic gradient descent (SGD), while the weights of the teacher model are updated as an exponential moving average (EMA) of the student weights: one part comes from the historical teacher weights and the other from the student weights at the current step. The teacher uses the EMA of the student model's weights:
θ'_t = α·θ'_{t-1} + (1 - α)·θ_t
where θ'_t, defined at training step t, is the EMA of the successive weights θ, and α is a smoothing-coefficient hyper-parameter.
The consistency regularization loss J is defined as the expected distance between the prediction of the student model and the prediction of the teacher model:
J(θ) = E_{x,η',η}[ ||f(x, θ'_t, η') - f(x, θ, η)||^2 ]
where η and η' are the different noises applied to the student model and the teacher model; the parameters are updated at each iteration, θ denotes the parameters of the student network, and θ'_t denotes the parameters of the teacher network.
During training, the parameters of the student model are updated by SGD, i.e. θ ← θ - μ·∇_θL, where θ is the student network parameter and μ is the learning rate.
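A brief PyTorch sketch of this update scheme, a standard mean-teacher step; the optimizer settings are illustrative.

```python
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, alpha: float = 0.99):
    """theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t, per parameter."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(alpha).add_(p_s, alpha=1.0 - alpha)

# One training step: SGD on the student, then the EMA update of the teacher.
# optimizer = torch.optim.SGD(student.parameters(), lr=mu)
# loss.backward(); optimizer.step(); optimizer.zero_grad()
# ema_update(teacher, student, alpha=0.99)
```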
The total loss L of the image segmentation model is:
L = L_class + L_context + J(θ)
In this embodiment, the data are enhanced by random cropping, random brightness, contrast and saturation adjustment, and the addition of Gaussian noise to the image.
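These enhancements map directly onto standard torchvision transforms; a sketch with illustrative parameter ranges (the patent does not specify them):

```python
import torch
from torchvision import transforms

class AddGaussianNoise:
    """Additive Gaussian noise on a tensor image in [0, 1]."""
    def __init__(self, std: float = 0.05):
        self.std = std
    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        return (x + torch.randn_like(x) * self.std).clamp(0.0, 1.0)

augment = transforms.Compose([
    transforms.RandomCrop(256, pad_if_needed=True),         # random cropping
    transforms.ColorJitter(brightness=0.4, contrast=0.4,    # random brightness, contrast
                           saturation=0.4),                 # and saturation adjustment
    transforms.ToTensor(),
    AddGaussianNoise(std=0.05),                             # Gaussian noise
])
```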
S3: the image to be segmented is segmented by the trained image segmentation model to obtain the image segmentation result.
The image to be segmented can be a medical image, a road-condition recognition image for autonomous driving, an object recognition image in robot vision, or a scene understanding image.
The acquired image to be segmented is input into the trained image segmentation model, and the image segmentation result is output through the image segmentation network branch of the model.
To address the problem of high-confidence features being moved toward low-confidence features, the image segmentation method disclosed in this embodiment introduces a new feature-difference bidirectional contrast loss applied to the foreground and background features of the segmentation result, so that high-confidence features are not erroneously affected by low-confidence features. The confidence computation based on feature differences judges the confidence and applies the bidirectional contrast loss only when the confidence difference is large enough, so that low-confidence features are aligned to high-confidence features; intra-class consistency and inter-class discriminability in the feature space are ensured, and the segmentation accuracy of the model is improved.
The context-aware consistency branch supplements the image segmentation branch by introducing context information into it. Cropping and data enhancement are applied simultaneously to the same image, so the two processed images carry different context information; consistency regularization over the results of the two differently contextualized images ensures that the foreground and background of the same target are segmented consistently under different contexts, avoids over-reliance on the contexts available in the training data, makes the high-level features less susceptible to environmental changes, and improves robustness. To let the decoder in the context branch provide more reliable pseudo labels and to reduce the influence of potential noise in the pseudo labels, an uncertainty loss is introduced in the context branch. The uncertainty loss uses uncertainty estimation to compute adaptive pixel weights, giving higher weight to pixels with higher confidence; it thus focuses on reliable pixels, reduces the influence of label noise, and better exploits the context information to enhance segmentation performance.
Example 2
In this embodiment, an image segmentation system based on feature-difference and context-aware consistency is disclosed, comprising:
an image acquisition module, used to acquire labeled images and unlabeled images;
an image segmentation model training module, used to train the constructed image segmentation model on the labeled and unlabeled images to obtain a trained image segmentation model;
wherein the image segmentation model is constructed as a teacher-student model; the student model comprises an image segmentation network branch and a context-consistency branch, and the context-consistency branch comprises a preprocessing module, an encoder, a decoder and a Projector module; when the image segmentation model is trained on an unlabeled image, the preprocessing module performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; the cropped image and the data-enhanced image are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are input into the decoder and the Projector module; the decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and a feature-difference bidirectional contrast loss is computed over the cropped-image segmentation features and the data-enhanced-image segmentation features; and
an image segmentation module, used to segment the image to be segmented with the trained image segmentation model to obtain the image segmentation result.
Example 3
In this embodiment, an electronic device is disclosed, comprising a memory, a processor, and computer instructions stored on the memory and run on the processor; when the computer instructions are executed by the processor, the steps of the image segmentation method based on feature-difference and context-aware consistency disclosed in embodiment 1 are performed.
Example 4
In this embodiment, a computer-readable storage medium is disclosed for storing computer instructions which, when executed by a processor, perform the steps of the image segmentation method based on feature-difference and context-aware consistency disclosed in embodiment 1.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the invention and not for limiting them. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that modifications and equivalents may be made to the specific embodiments of the invention without departing from its spirit and scope, and such modifications and equivalents are intended to be covered by the claims.

Claims (10)

1. An image segmentation method based on feature-difference and context-aware consistency, characterized by comprising:
acquiring labeled images and unlabeled images;
training the constructed image segmentation model on the labeled and unlabeled images to obtain a trained image segmentation model, wherein the image segmentation model is constructed as a teacher-student model; the student model comprises an image segmentation network branch and a context-consistency branch, and the context-consistency branch comprises a preprocessing module, an encoder, a decoder and a Projector module; when the image segmentation model is trained on an unlabeled image, the preprocessing module performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; the cropped image and the data-enhanced image are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are input into the decoder and the Projector module; the decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and the Projector module outputs the cropped-image segmentation features and the data-enhanced-image segmentation features, over which a feature-difference bidirectional contrast loss is computed; and
segmenting the image to be segmented with the trained image segmentation model to obtain the image segmentation result.
2. The image segmentation method based on feature-difference and context-aware consistency according to claim 1, wherein, when the image segmentation model is trained on the unlabeled image, the unlabeled image is segmented by the image segmentation network branch to obtain the foreground image and the background image of the unlabeled image; the feature-difference bidirectional contrast loss between the foreground image and the background image of the unlabeled image is computed, together with the feature-difference bidirectional contrast loss between the cropped-image segmentation features and the data-enhanced-image segmentation features, the uncertainty loss between the cropped-image segmentation result and the predicted segmentation result of the unlabeled image in the image segmentation network branch, the uncertainty loss between the data-enhanced-image segmentation result and that predicted segmentation result, and the consistency regularization loss between the cropped-image segmentation result and the data-enhanced-image segmentation result; and the student model is trained with these losses.
3. The image segmentation method based on feature-difference and context-aware consistency according to claim 2, wherein the consistency regularization loss is computed as the cosine distance between the cropped-image segmentation result and the data-enhanced-image segmentation result.
4. The image segmentation method based on feature-difference and context-aware consistency according to claim 1, wherein, when the image segmentation model is trained on the labeled image, the labeled image is segmented by the image segmentation network branch to obtain the foreground image of the labeled image; the features of the labeled image are extracted by the encoder of the context-consistency branch, input into the decoder, and the predicted segmentation result of the labeled image is output; the cross-entropy loss between the foreground image of the labeled image and the ground-truth label is computed and used to train the image segmentation network branch; and the cross-entropy loss and the Dice loss between the predicted segmentation result and the ground-truth label are computed and used to train the context-consistency branch.
5. The image segmentation method based on feature-difference and context-aware consistency according to claim 4, wherein the context-consistency branch is trained on labeled images before it is trained on unlabeled images.
6. The image segmentation method based on feature-difference and context-aware consistency according to claim 1, wherein the teacher network has the same structure as the image segmentation network branch in the student network, and, when the teacher network is trained, the weights of the teacher network are updated from the weights of the image segmentation network branch in the student network.
7. The image segmentation method based on feature-difference and context-aware consistency according to claim 6, further comprising, when training the image segmentation model, computing the consistency regularization loss between the segmentation result output by the student network and the segmentation result output by the teacher network, and training the image segmentation model with this consistency regularization loss.
8. An image segmentation system based on feature-difference and context-aware consistency, characterized by comprising:
an image acquisition module, used to acquire labeled images and unlabeled images;
an image segmentation model training module, used to train the constructed image segmentation model on the labeled and unlabeled images to obtain a trained image segmentation model, wherein the image segmentation model is constructed as a teacher-student model; the student model comprises an image segmentation network branch and a context-consistency branch, and the context-consistency branch comprises a preprocessing module, an encoder, a decoder and a Projector module; when the image segmentation model is trained on an unlabeled image, the preprocessing module performs foreground cropping and data enhancement on the unlabeled image to obtain a cropped image and a data-enhanced image; the cropped image and the data-enhanced image are input into the encoder to obtain cropped-image features and data-enhanced-image features, which are input into the decoder and the Projector module; the decoder outputs the cropped-image segmentation result and the data-enhanced-image segmentation result, and a feature-difference bidirectional contrast loss is computed over the cropped-image segmentation features and the data-enhanced-image segmentation features; and
an image segmentation module, used to segment the image to be segmented with the trained image segmentation model to obtain the image segmentation result.
9. An electronic device, comprising a memory, a processor, and computer instructions stored on the memory and run on the processor, wherein, when the computer instructions are executed by the processor, the steps of the image segmentation method based on feature-difference and context-aware consistency according to any one of claims 1-7 are performed.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the image segmentation method based on feature-difference and context-aware consistency according to any one of claims 1-7.
CN202310533868.7A 2023-05-09 2023-05-09 Image segmentation method and system based on feature difference and context awareness consistency Active CN116543162B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310533868.7A CN116543162B (en) 2023-05-09 2023-05-09 Image segmentation method and system based on feature difference and context awareness consistency


Publications (2)

Publication Number Publication Date
CN116543162A true CN116543162A (en) 2023-08-04
CN116543162B (en) 2024-07-12

Family

ID=87450250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310533868.7A Active CN116543162B (en) 2023-05-09 2023-05-09 Image segmentation method and system based on feature difference and context awareness consistency

Country Status (1)

Country Link
CN (1) CN116543162B (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507343A (en) * 2019-01-30 2020-08-07 广州市百果园信息技术有限公司 Training of semantic segmentation network and image processing method and device thereof
WO2022041307A1 (en) * 2020-08-31 2022-03-03 温州医科大学 Method and system for constructing semi-supervised image segmentation framework
CN113129309A (en) * 2021-03-04 2021-07-16 同济大学 Medical image semi-supervised segmentation system based on object context consistency constraint
CN116051574A (en) * 2022-12-28 2023-05-02 河南大学 Semi-supervised segmentation model construction and image analysis method, device and system
CN115797639A (en) * 2023-02-07 2023-03-14 中山大学 Text supervision semantic segmentation algorithm based on multi-view consistency
CN116030044A (en) * 2023-03-01 2023-04-28 北京工业大学 Boundary-aware semi-supervised medical image segmentation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hou Biao (侯彪) et al.: "Multiscale texture image segmentation based on adaptive window fixing and propagation" (基于自适应窗口固定及传播的多尺度纹理图像分割), Acta Electronica Sinica (电子学报), 15 July 2009 (2009-07-15) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333874A (en) * 2023-10-27 2024-01-02 江苏新希望科技有限公司 Image segmentation method, system, storage medium and device

Also Published As

Publication number Publication date
CN116543162B (en) 2024-07-12

Similar Documents

Publication Publication Date Title
CN109753913B (en) Multi-mode video semantic segmentation method with high calculation efficiency
CN112734775B (en) Image labeling, image semantic segmentation and model training methods and devices
CN111368846B (en) Road ponding identification method based on boundary semantic segmentation
CN111696110B (en) Scene segmentation method and system
CN113313166B (en) Ship target automatic labeling method based on feature consistency learning
CN112801182B (en) RGBT target tracking method based on difficult sample perception
CN114266794A (en) Pathological section image cancer region segmentation system based on full convolution neural network
CN113065551A (en) Method for performing image segmentation using a deep neural network model
CN114943888B (en) Sea surface small target detection method based on multi-scale information fusion
CN116664840A (en) Semantic segmentation method, device and equipment based on mutual relationship knowledge distillation
US20240185590A1 (en) Method for training object detection model, object detection method and apparatus
CN114529894A (en) Rapid scene text detection method fusing hole convolution
CN116597339A (en) Video target segmentation method based on mask guide semi-dense contrast learning
CN115690704B (en) LG-CenterNet model-based complex road scene target detection method and device
CN116543162B (en) Image segmentation method and system based on feature difference and context awareness consistency
CN116681961A (en) Weak supervision target detection method based on semi-supervision method and noise processing
CN116433909A (en) Similarity weighted multi-teacher network model-based semi-supervised image semantic segmentation method
CN115861886A (en) Fan blade segmentation method and device based on video segment feature matching
CN116958919A (en) Target detection method, target detection device, computer readable medium and electronic equipment
CN115359091A (en) Armor plate detection tracking method for mobile robot
CN114973202A (en) Traffic scene obstacle detection method based on semantic segmentation
CN117456191B (en) Semantic segmentation method based on three-branch network structure under complex environment
CN117636072B (en) Image classification method and system based on difficulty perception data enhancement and label correction
CN116452634A (en) One-stage multi-target tracking method and system using global response graph
CN116912283A (en) Video segmentation method and device based on cascade residual convolution neural network

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant