WO2020238734A1 - Training method and apparatus for an image segmentation model, computer device, and storage medium - Google Patents

Training method and apparatus for an image segmentation model, computer device, and storage medium

Info

Publication number
WO2020238734A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
discriminator
source domain
segmentation
segmentation result
Prior art date
Application number
PCT/CN2020/091455
Other languages
English (en)
French (fr)
Inventor
柳露艳
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP20815327.0A (published as EP3979198A4)
Publication of WO2020238734A1
Priority to US17/470,433 (published as US11961233B2)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/162Segmentation; Edge detection involving graph-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/174Segmentation; Edge detection involving the use of two or more images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10088Magnetic resonance imaging [MRI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20072Graph-based image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30016Brain
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images
    • G06V2201/031Recognition of patterns in medical or anatomical images of internal organs

Definitions

  • The embodiments of the present application relate to the field of image recognition technology, and in particular to a training method, apparatus, computer device, and storage medium for an image segmentation model.
  • Image segmentation refers to classifying each pixel in the image and marking the target area.
  • Image segmentation can be applied to medical image analysis, unmanned vehicle driving, geographic information system, underwater object detection and other fields.
  • image segmentation can be used to achieve tasks such as the positioning of tumors and other lesions, the measurement of tissue volume, and the study of anatomical structures.
  • Traditional image segmentation methods rely on a large number of labeled images and assume that the data distributions of the training image set (i.e., the source domain images) and the test image set (i.e., the target domain images) are consistent.
  • In the related art, the source domain images and the target domain images are aligned in the feature space, so that the trained model is adapted to the target domain images in the feature space.
  • However, the image is processed in multiple steps, so the segmentation result of the target domain image in the output space is not accurate enough.
  • According to various embodiments of the present application, a training method for an image segmentation model is provided.
  • A training method for an image segmentation model, executed by a computer device, the method including:
  • retraining the pre-trained image segmentation model, and iterating the training in this manner until convergence, to obtain a trained image segmentation model.
  • An image segmentation method, executed by a computer device, including:
  • the trained image segmentation model is obtained by training the image segmentation model with adversarial learning in the output space through a first discriminator and a second discriminator;
  • the first discriminator is used to reduce the difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image during training of the image segmentation model; the second discriminator is used to reduce the difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image during training of the image segmentation model.
  • a training device for an image segmentation model comprising:
  • the first training module is configured to use source domain samples to train an initial image segmentation model to obtain a pre-trained image segmentation model, where the source domain samples include source domain images and standard segmentation results of the source domain images;
  • the result extraction module is used to extract the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image through the pre-trained image segmentation model;
  • the second training module is used to train a first discriminator using the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image, where the first discriminator is used to discriminate whether an input segmentation result comes from the source domain or the target domain;
  • the third training module is used to train a second discriminator using the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image, where the second discriminator is used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result;
  • the fourth training module is used to retrain the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating the training until convergence to obtain a trained image segmentation model.
  • A computer device includes a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above training method for an image segmentation model or the above image segmentation method.
  • A computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the above training method for an image segmentation model or the above image segmentation method.
  • A computer program product, when executed, is used to perform the above training method for an image segmentation model or the above image segmentation method.
  • Fig. 1 is a flowchart of a method for training an image segmentation model provided by an embodiment of the present application.
  • Fig. 2 exemplarily shows a schematic flowchart of an image segmentation model training method.
  • Fig. 3 is a flowchart of a method for training an image segmentation model provided by another embodiment of the present application.
  • Fig. 4 exemplarily shows a schematic diagram of segmentation results under different segmentation methods.
  • Fig. 5 shows example diagrams of brain tumor segmentation results under different segmentation methods.
  • Fig. 6 shows example diagrams of spinal cord gray matter segmentation results under different segmentation methods.
  • Fig. 7 is a block diagram of an image segmentation model training device provided by an embodiment of the present application.
  • Fig. 8 is a block diagram of an image segmentation model training device provided by an embodiment of the present application.
  • Fig. 9 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • Image segmentation refers to classifying each pixel in the image and marking the target area.
  • Image segmentation can be applied to medical image analysis, unmanned vehicle driving, geographic information system, underwater object detection and other fields.
  • In the medical field, image segmentation can be used for locating tumors and other lesions, measuring tissue volume, and studying anatomical structures.
  • In the field of autonomous driving, after a vehicle-mounted camera or lidar captures an environment image, image segmentation can be used to process the image, detect the ground, identify the passable area, and then plan the driving path.
  • In the field of geographic information systems, after satellite remote sensing images are acquired, image segmentation can be used to process them, identify roads, rivers, crops, buildings, and so on, and label each pixel in the image.
  • In view of this, the embodiments of the present application propose an image segmentation model with domain adaptation in the output space.
  • The predicted segmentation results of the source domain image and the target domain image are extracted through an image segmentation model pre-trained on source domain samples; the predicted segmentation results of the source domain image and the target domain image are input to a first discriminator, and the predicted segmentation result of the source domain image together with the standard segmentation result of the source domain image are input to a second discriminator; the pre-trained image segmentation model is then trained iteratively using the idea of adversarial learning until the model converges, yielding a trained image segmentation model.
  • The technical solution provided by this application aligns the source domain image and the target domain image in the output space, so that the trained image segmentation model reduces the difference between the source domain image and the target domain image in the output space and reduces its error in segmenting the target domain, which in turn makes the segmentation result of the target domain image more accurate.
  • In the method embodiments of the present application, the execution subject of each step may be a computer device, which refers to an electronic device with data computation, processing, and storage capabilities, such as a PC (Personal Computer) or a server.
  • FIG. 1 shows a flowchart of a method for training an image segmentation model provided by an embodiment of the present application.
  • The method is applied to a computer device and may include the following steps (101-105):
  • Step 101 Use source domain samples to train an initial image segmentation model to obtain a pre-trained image segmentation model.
  • the aforementioned source domain samples include source domain images and standard segmentation results of source domain images.
  • The source domain image can be an image collected by an image acquisition device (such as a camera, medical equipment, or lidar), an image pre-stored locally, or an image obtained from a network, which is not limited in the embodiments of the present application.
  • the foregoing source domain image may be an image in a picture format or a video image. In the embodiment of the present application, the format of the foregoing source domain image is not limited.
  • The target area can be an area of interest to the user, such as a person area, an animal area, a plant area, or another designated area in a landscape image; it can also be a tissue or organ area, a cell area, or a lesion area in a medical image, which is not limited in the embodiments of the present application.
  • The aforementioned standard segmentation result of a source domain image refers to the source domain image with the target area accurately annotated, that is, the ground-truth segmentation label.
  • the standard segmentation result can be manually annotated by professionals.
  • For example, the target area can be the lesion area in a medical image;
  • the standard segmentation result of the source domain image is then the medical image with the lesion area accurately labeled, which benefits clinical diagnosis, treatment, and medical research.
  • For example, in a tumor segmentation task, the tumor area is the target area of the medical image.
  • the above-mentioned image segmentation model is used to segment the target area in the image input to the image segmentation model, and obtain the segmentation result corresponding to the input image.
  • the source domain samples include the source domain images and the standard segmentation results of the source domain images
  • the source domain samples can be used to train the initial image segmentation model and update the relevant parameters of the initial image segmentation model to obtain a pre-trained image segmentation model
  • Compared with the initial image segmentation model, the pre-trained image segmentation model produces more accurate segmentation results for the same image.
  • The framework of the above image segmentation model can be a CNN (Convolutional Neural Network), a DCNN (Deep Convolutional Neural Network), a ResNet (Residual Neural Network), a DenseNet (Densely Connected Convolutional Network), and so on; it may also be another model structure usable for image segmentation, which is not limited in the embodiments of the present application.
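  • As a concrete illustration of step 101, the following is a minimal PyTorch-style sketch of the pre-training loop; the model class, data loader, and hyperparameters are assumptions for illustration, not prescribed by the patent (the 1.5×10^-5 learning rate echoes the example value given later in this description):

```python
import torch
import torch.nn as nn

def pretrain_segmentation_model(seg_model, source_loader, num_epochs=20, lr=1.5e-5):
    """Pre-train the initial segmentation model on source domain samples, i.e.
    source images paired with their standard (ground-truth) segmentations."""
    criterion = nn.CrossEntropyLoss()                       # per-pixel classification loss
    optimizer = torch.optim.Adam(seg_model.parameters(), lr=lr)
    seg_model.train()
    for _ in range(num_epochs):
        for x_s, y_s in source_loader:                      # x_s: (N, 3, H, W), y_s: (N, H, W)
            logits = seg_model(x_s)                         # (N, C, H, W) per-pixel class scores
            loss = criterion(logits, y_s)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return seg_model
```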
  • Step 102 Extract the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image through the pre-trained image segmentation model.
  • The above target domain image is an image belonging to the same type of task as the source domain image, but with a different data distribution.
  • For example, both the target domain image and the source domain image are used to detect a tumor area, but they come from different acquisition equipment, or from different hospitals or centers, which causes the distributions of the target domain images and the source domain images to differ considerably.
  • For example, the target domain image is a CT (Computed Tomography) image while the source domain image is an MRI (Magnetic Resonance Imaging) image; because the two types of medical image capture different information, the distribution of the tumor area differs between the CT image and the MRI image.
  • In another example, both the target domain image and the source domain image are used to identify the passable area on the ground, but the target domain image is captured by a vehicle-mounted camera while the source domain image is captured by lidar; the different representations of images collected by different equipment result in differences in how the ground and the passable area appear.
  • In another example, the target domain image and the source domain image can both be used to segment brain tumor tissue, but they are medical images from different centers or hospitals, that is, the distribution of tumor regions in the target domain images differs from that in the source domain images.
  • the aforementioned predicted segmentation result of the source domain image refers to the image after the target area is marked in the source domain image through the image segmentation model.
  • the aforementioned predicted segmentation result of the target domain image refers to the image after the target area is marked in the target domain image through the image segmentation model.
  • Optionally, the image segmentation model can obtain the respective feature maps of the source domain image and the target domain image, determine the category of each pixel in the feature map, and annotate the target area based on this category information, thereby obtaining the predicted segmentation result of the image.
  • For example, in a tumor segmentation task, the image segmentation model needs to determine whether each pixel in the image belongs to the tumor area and label the pixels that do, so as to obtain an image with the tumor area marked.
  • Step 103 Use the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image to train the first discriminator.
  • the computer device After extracting the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image, the computer device inputs the aforementioned segmentation result into the first discriminator to train the first discriminator.
  • the above-mentioned first discriminator is used to discriminate whether the input segmentation result comes from the source domain or the target domain.
  • The trained first discriminator can determine as accurately as possible whether the input segmentation result comes from the source domain or the target domain.
  • the above-mentioned discriminator can be constructed using CNNs.
  • For example, the CNN may include multiple convolutional layers, such as 5 convolutional layers, where each convolutional layer has a kernel size of 2, a stride of 2, and a padding of 1.
  • Each of the first 4 convolutional layers can be followed by an activation function layer, such as a Leaky ReLU layer, a ReLU layer, or an RReLU layer; the output of the last convolutional layer has 2 channels, corresponding to the categories the discriminator assigns to the input predicted segmentation result, namely from the source domain or from the target domain.
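  • A minimal PyTorch sketch of such a discriminator is shown below; the channel widths and the Leaky ReLU slope are assumptions, while the layer count, kernel size, stride, padding, and 2-channel output follow the description above:

```python
import torch.nn as nn

class OutputSpaceDiscriminator(nn.Module):
    """Fully convolutional discriminator over segmentation maps: 5 conv layers
    (kernel 2, stride 2, padding 1), a Leaky ReLU after each of the first 4,
    and a final 2-channel output (e.g. source domain vs. target domain)."""

    def __init__(self, num_classes, base_channels=64):      # channel widths are assumptions
        super().__init__()
        chans = [num_classes, base_channels, base_channels * 2,
                 base_channels * 4, base_channels * 8]
        layers = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):      # first 4 conv + activation layers
            layers += [nn.Conv2d(c_in, c_out, kernel_size=2, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
        layers.append(nn.Conv2d(chans[-1], 2, kernel_size=2, stride=2, padding=1))
        self.net = nn.Sequential(*layers)

    def forward(self, seg_map):                             # seg_map: (N, C, H, W) softmax output
        return self.net(seg_map)                            # (N, 2, H', W') per-location logits
```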
  • Step 104 Use the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image to train the second discriminator.
  • After extracting the predicted segmentation result of the source domain image, the computer device may also input this predicted segmentation result and the standard segmentation result of the source domain image into the second discriminator to train the second discriminator.
  • the above-mentioned second discriminator is used to discriminate whether the input segmentation result is a predicted segmentation result or a standard segmentation result.
  • the trained second discriminator can determine whether the input segmentation result is a predicted segmentation result or a standard segmentation result as accurately as possible.
  • the above-mentioned second discriminator may also be constructed using CNNs, and its structure may be the same as or different from that of the first discriminator, which is not limited in the embodiment of the present application.
  • Step 105 Retrain the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating the training until convergence to obtain a trained image segmentation model.
  • the loss function of the aforementioned pre-trained image segmentation model is used to measure the segmentation accuracy of the image segmentation model.
  • the adversarial loss function of the first discriminator is used to measure the degree of difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image.
  • The predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image are input to the first discriminator for adversarial learning: the first discriminator needs to determine as accurately as possible whether the input segmentation result comes from the source domain or the target domain, while the image segmentation model needs to segment the target domain image well enough that the first discriminator judges the segmentation result of the target domain image as coming from the source domain.
  • the adversarial loss function of the second discriminator is used to measure the degree of difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image.
  • The predicted segmentation result of the source domain image and the standard segmentation result of the source domain image are input to the second discriminator for adversarial learning: the second discriminator needs to determine as accurately as possible whether the input segmentation result is the predicted segmentation result or the standard segmentation result of the source domain image, while the image segmentation model needs to segment the source domain image accurately enough that the second discriminator judges the predicted segmentation result of the source domain image as the standard segmentation result.
  • Through this adversarial learning, the difference between the predicted segmentation result and the standard segmentation result of the source domain image is reduced.
  • The computer device performs cyclic training on the pre-trained image segmentation model through the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, until the model converges and a trained image segmentation model is obtained.
  • The above cyclic training of the pre-trained image segmentation model includes repeating steps 102 to 105, continuously adjusting the parameters of the image segmentation model according to the values, obtained in each round of training, of the loss function of the image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, until the model converges and a trained image segmentation model is obtained.
  • The trained image segmentation model can reduce the difference between the source domain image and the target domain image in the output space, reduce its segmentation error on the target domain, and thus make the segmentation result of the target domain image more precise.
  • Referring to Fig. 2, it exemplarily shows a schematic flowchart of an image segmentation model training method, in which:
  • X_S represents the source domain image;
  • Y_S represents the standard segmentation result of the source domain image;
  • X_T represents the target domain image;
  • P_S represents the predicted segmentation result of the source domain image;
  • P_T represents the predicted segmentation result of the target domain image;
  • L_D1(P_T) represents the discriminant loss function of the first discriminator;
  • L_Adv1(X_T) represents the adversarial loss function of the first discriminator;
  • L_D2(P_S) represents the discriminant loss function of the second discriminator;
  • L_Adv2(X_S) represents the adversarial loss function of the second discriminator;
  • L_Seg(X_S) represents the loss function of the pre-trained image segmentation model.
  • the computer device inputs the source domain image and the target domain image to the image segmentation model.
  • The image segmentation model (which can be a pre-trained image segmentation model) outputs the segmentation results of the source domain image and the target domain image.
  • The segmentation results of the source domain image and the target domain image are input to the first discriminator to obtain the discrimination result of the first discriminator, from which the discriminant loss function and the adversarial loss function of the first discriminator are obtained; the segmentation result of the source domain image and the standard segmentation result of the source domain image are input to the second discriminator to obtain the discrimination result of the second discriminator, from which the discriminant loss function and the adversarial loss function of the second discriminator are obtained.
  • Then the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator are fed back to the image segmentation model, and the parameters of the image segmentation model are adjusted by minimizing the value of the loss function of the image segmentation model and the weighted sum of the adversarial loss functions.
  • The technical solutions provided in the embodiments of this application can be applied to the model training process of image segmentation tasks in the AI (Artificial Intelligence) field, and are particularly suitable for training an image segmentation model on data sets with domain shift problems.
  • the training data set may include multiple medical images taken from different medical devices.
  • the input is a medical image
  • the output is the segmentation result of the segmented lesion area
  • The image segmentation network is optimized through the first discriminator and the second discriminator so that the segmentation result of the source domain image and the segmentation result of the target domain image predicted by the image segmentation model are as close as possible to the standard segmentation result of the source domain image; finally, a more accurate image segmentation model is trained to assist doctors in lesion diagnosis and analysis.
  • In summary, in the technical solution provided by the embodiments of the present application, the predicted segmentation results of the source domain image and the target domain image are extracted through an image segmentation model pre-trained on source domain samples; the predicted segmentation results of the source domain image and the target domain image are input to the first discriminator, and the predicted segmentation result of the source domain image together with the standard segmentation result of the source domain image are input to the second discriminator; the pre-trained image segmentation model is then trained iteratively using the idea of adversarial learning until the model converges, and a trained image segmentation model is obtained.
  • The technical solution provided by this application aligns the source domain image and the target domain image in the output space, so that the trained image segmentation model reduces the difference between the source domain image and the target domain image in the output space and reduces its error in segmenting the target domain, making the segmentation result of the target domain image more accurate.
  • In addition, the image segmentation model is further trained with the second discriminator, so that the segmentation result of the source domain image and the segmentation result of the target domain image predicted by the image segmentation model are as close as possible to the standard segmentation result of the source domain image, which further improves the accuracy of the model.
  • FIG. 3 shows a flowchart of a method for training an image segmentation model provided by another embodiment of the present application.
  • the method can include the following steps (301-312):
  • Step 301 Use source domain samples to train the initial image segmentation model to obtain a pre-trained image segmentation model.
  • the foregoing source domain samples include source domain images and standard segmentation results of the source domain images.
  • the aforementioned image segmentation model may be a DeepLabv3+ model.
  • The DeepLabv3+ model includes an ASPP (Atrous Spatial Pyramid Pooling) module and an encoder-decoder structure, combining the advantages of both: ASPP can operate at multiple different rates and with different receptive fields to encode texture information at different scales, while the encoder-decoder structure can obtain clearer object boundary information by gradually restoring spatial information.
  • Optionally, the above image segmentation model may also be a DeepLabv2 model, a RefineNet model, a ResNet model, or the like, which is not limited in the embodiments of the present application.
  • Step 302 Extract the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image through the pre-trained image segmentation model.
  • After the pre-trained image segmentation model is obtained, the source domain samples are input into the pre-trained image segmentation model again to extract the predicted segmentation result of the source domain image.
  • Optionally, the source domain samples include a first sample set and a second sample set; the computer device can use the first sample set to train the initial image segmentation model to obtain a pre-trained image segmentation model, and afterwards use the second sample set to retrain the pre-trained image segmentation model.
  • In this case, the input of the pre-trained image segmentation model is the second sample set of the source domain samples, and what is extracted is the predicted segmentation result of the source domain images in the second sample set.
  • the target domain image has been introduced in the embodiment of FIG. 1, and will not be repeated here.
  • the target domain image is input to the pre-trained image segmentation model to extract the predicted segmentation result of the target domain image.
  • Step 303 Input the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image to the first discriminator, respectively, to obtain the discrimination result of the first discriminator.
  • the above-mentioned first discriminator is used to discriminate whether the input segmentation result comes from the source domain or the target domain, that is, the first discriminator performs a two-classification task.
  • the result of the first discriminator can be 0 or 1. When the result is 0, it means that the input segmentation result is from the source domain; when the result is 1, it means that the input segmentation result is from the target domain.
  • Step 304 Calculate the value of the discrimination loss function of the first discriminator according to the discrimination result of the first discriminator.
  • the discrimination loss function of the first discriminator is used to measure the discrimination accuracy of the first discriminator.
  • The discriminant loss function L_D1(P_T) of the above first discriminator is defined over P_T, the predicted segmentation result of the target domain image, which can be expressed as P_T = G_seg(X_T), where G_seg represents the image segmentation model and X_T represents the target domain image; P_T ∈ R^(H×W×C), where H and W represent the height and width of the predicted segmentation result of the target domain image and C represents the number of segmentation categories.
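  • The formula for L_D1 is not reproduced in this text; a standard binary cross-entropy form consistent with the definitions above (common in output-space domain adaptation) would be, as an assumed reconstruction:

```latex
L_{D1}(P_T) = -\sum_{h,w}\Big[(1-z)\,\log D_1(P_T)^{(h,w,0)} + z\,\log D_1(P_T)^{(h,w,1)}\Big]
```

  • where z = 0 when the input segmentation map comes from the source domain, z = 1 when it comes from the target domain, and D_1(·)^(h,w,i) is the first discriminator's per-location probability for class i.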
  • Step 305 Adjust the parameters of the first discriminator by minimizing the value of the discriminant loss function of the first discriminator.
  • In the training process, the computer device can adjust the parameters of the first discriminator by minimizing the value of its discriminant loss function, so that the first discriminator can determine as accurately as possible whether the input segmentation result comes from the source domain or the target domain.
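  • A minimal PyTorch-style sketch of one such discriminator update is given below; the 0/1 label convention follows step 303, while the optimizer and the softmax normalization of the segmentation maps are assumptions:

```python
import torch
import torch.nn.functional as F

def train_first_discriminator_step(d1, seg_model, x_s, x_t, d1_optimizer):
    """One update of the first discriminator: label source-domain segmentation
    maps as 0 and target-domain maps as 1, per the convention in step 303."""
    with torch.no_grad():                                   # segmenter is frozen in this step
        p_s = F.softmax(seg_model(x_s), dim=1)              # predicted segmentation, source image
        p_t = F.softmax(seg_model(x_t), dim=1)              # predicted segmentation, target image
    logits_s, logits_t = d1(p_s), d1(p_t)
    label_s = torch.zeros(logits_s.shape[0], logits_s.shape[2], logits_s.shape[3],
                          dtype=torch.long, device=logits_s.device)
    label_t = torch.ones(logits_t.shape[0], logits_t.shape[2], logits_t.shape[3],
                         dtype=torch.long, device=logits_t.device)
    loss = F.cross_entropy(logits_s, label_s) + F.cross_entropy(logits_t, label_t)
    d1_optimizer.zero_grad()
    loss.backward()
    d1_optimizer.step()
    return loss.item()
```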
  • Step 306 Calculate the value of the adversarial loss function of the first discriminator according to the discrimination result of the first discriminator on the predicted segmentation result of the target domain image.
  • the adversarial loss function of the first discriminator is used to measure the degree of difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image.
  • The adversarial loss function L_Adv1(X_T) is defined in terms of X_T, the target domain image, and L_MAE, the mean absolute error function.
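  • The formula itself is not reproduced here; one plausible reconstruction consistent with the use of the mean absolute error is, as an assumption:

```latex
L_{Adv1}(X_T) = \sum_{h,w} L_{MAE}\!\Big(D_1\big(G_{seg}(X_T)\big)^{(h,w)},\, z_{src}\Big)
```

  • where z_src denotes the "source domain" label, so that minimizing L_Adv1 drives the first discriminator to judge target-domain predictions as coming from the source domain.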
  • Step 307 Input the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image to the second discriminator, respectively, to obtain the discrimination result of the second discriminator.
  • The above second discriminator is used to discriminate whether the input segmentation result is a predicted segmentation result or a standard segmentation result; the second discriminator also performs a binary classification task.
  • Optionally, the result of the second discriminator can be a number between 0 and 1: when the result is 0, it means the input is the predicted segmentation result of the source domain image; when the result is 1, it means the input is the standard segmentation result of the source domain image.
  • Step 308 Calculate the value of the discrimination loss function of the second discriminator according to the discrimination result of the second discriminator.
  • the discrimination loss function of the second discriminator is used to measure the discrimination accuracy of the second discriminator.
  • The discriminant loss function L_D2(P_S) of the above second discriminator is defined over P_S, the predicted segmentation result of the source domain image, and a constant u; the predicted segmentation result P_S can be expressed as P_S = G_seg(X_S), where G_seg represents the image segmentation model and X_S represents the source domain image.
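  • The formula is not reproduced in this text; by analogy with L_D1, a plausible reconstruction (an assumption, with u as the constant label distinguishing the two kinds of input) is:

```latex
L_{D2}(P) = -\sum_{h,w}\Big[(1-u)\,\log D_2(P)^{(h,w,0)} + u\,\log D_2(P)^{(h,w,1)}\Big]
```

  • where u = 0 when the input P is a predicted segmentation result and u = 1 when it is a standard segmentation result.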
  • Step 309 Adjust the parameters of the second discriminator by minimizing the value of the discriminant loss function of the second discriminator.
  • The value of the discriminant loss function reflects the discrimination accuracy of the second discriminator and is inversely proportional to it. Therefore, in the training process, the parameters of the second discriminator can be adjusted by minimizing the value of its discriminant loss function, so that the second discriminator can determine as accurately as possible whether the input segmentation result is the predicted segmentation result or the standard segmentation result of the source domain image.
  • Step 310 Calculate the value of the adversarial loss function of the second discriminator according to the discrimination result of the second discriminator on the predicted segmentation result of the source domain image.
  • the adversarial loss function of the second discriminator is used to measure the degree of difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image.
  • The adversarial loss function L_Adv2(X_S) of the above second discriminator is defined in terms of X_S, the source domain image, and L_MAE, the mean absolute error function.
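  • Again the formula is not reproduced here; a reconstruction symmetric to L_Adv1 (an assumption) is:

```latex
L_{Adv2}(X_S) = \sum_{h,w} L_{MAE}\!\Big(D_2\big(G_{seg}(X_S)\big)^{(h,w)},\, u_{gt}\Big)
```

  • where u_gt denotes the "standard segmentation result" label, so that minimizing L_Adv2 drives the second discriminator to judge predicted source-domain segmentations as standard segmentations.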
  • Step 311 Construct an objective function according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator.
  • The loss function L_Seg(X_S) of the above pre-trained image segmentation model can adopt a cross-entropy (CE) loss function defined over X_S, the source domain image, and Y_S, the standard segmentation result of the source domain image.
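  • The formula is not reproduced in this text; the standard pixel-wise cross-entropy form, consistent with the definitions above, is the following reconstruction:

```latex
L_{Seg}(X_S) = -\sum_{h,w}\sum_{c \in C} Y_S^{(h,w,c)}\,\log P_S^{(h,w,c)}
```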
  • The objective function of the above image segmentation model training can be expressed as a weighted combination of the three losses, where λ_Seg, λ_Adv1, and λ_Adv2 are adjustment parameters used to balance, during training, the loss function of the image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator.
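  • Written out, the weighted objective described above takes the form:

```latex
\min_{G_{seg}} \; \lambda_{Seg}\, L_{Seg}(X_S) + \lambda_{Adv1}\, L_{Adv1}(X_T) + \lambda_{Adv2}\, L_{Adv2}(X_S)
```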
  • Step 312 Adjust the parameters of the pre-trained image segmentation model by minimizing the value of the loss function of the image segmentation model and the value of the weighted sum of the adversarial loss function of the first discriminator and the adversarial loss function of the second discriminator, and by maximizing the values of the discriminant loss functions of the first discriminator and the second discriminator, to obtain a trained image segmentation model.
  • After the computer device obtains the value of the loss function of the image segmentation model, it feeds the value of the adversarial loss function of the first discriminator and the value of the adversarial loss function of the second discriminator back to the image segmentation network; the image segmentation network adjusts its parameters to minimize the value of the loss function of the image segmentation model and the weighted sum of the two adversarial loss functions, and to maximize the values of the discriminant loss functions of the first and second discriminators. Through this adversarial training of the segmentation network and the discriminant networks, the segmentation result of the source domain image and the segmentation result of the target domain image predicted by the image segmentation model are made as close as possible to the standard segmentation result of the source domain image.
  • Through the adversarial training with the first discriminator, the segmentation result of the source domain image and the segmentation result of the target domain image gradually become closer to each other.
  • However, in this process the segmentation result of the source domain image may gradually move away from the standard segmentation result of the source domain image; that is, the segmentation accuracy of the model on the source domain image is reduced.
  • Therefore, by minimizing the adversarial loss function of the second discriminator, the segmentation result of the source domain image is pushed gradually closer to the standard segmentation result of the source domain image, so that both the source domain segmentation result and the target domain segmentation result predicted by the image segmentation model are as close as possible to the standard segmentation result of the source domain image.
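  • The following PyTorch-style sketch illustrates one such adversarial update of the segmentation model; the λ values, the label conventions, and the MAE formulation are assumptions consistent with the description above, not values fixed by the patent:

```python
import torch.nn.functional as F

def train_segmenter_step(seg_model, d1, d2, x_s, y_s, x_t, seg_optimizer,
                         lambda_seg=1.0, lambda_adv1=0.001, lambda_adv2=0.001):
    """One adversarial update of the segmentation model: supervised loss on the
    source domain plus two adversarial terms that try to fool D1 and D2."""
    logits_s = seg_model(x_s)
    p_s = F.softmax(logits_s, dim=1)                        # P_S, source prediction
    p_t = F.softmax(seg_model(x_t), dim=1)                  # P_T, target prediction

    loss_seg = F.cross_entropy(logits_s, y_s)               # L_Seg(X_S)

    # L_Adv1: push D1 to judge target predictions as "source" (class-1 prob -> 0).
    d1_prob_target = F.softmax(d1(p_t), dim=1)[:, 1]
    loss_adv1 = d1_prob_target.abs().mean()                 # MAE toward the source label

    # L_Adv2: push D2 to judge source predictions as "standard" (class-1 prob -> 1).
    d2_prob_standard = F.softmax(d2(p_s), dim=1)[:, 1]
    loss_adv2 = (1.0 - d2_prob_standard).abs().mean()       # MAE toward the standard label

    loss = lambda_seg * loss_seg + lambda_adv1 * loss_adv1 + lambda_adv2 * loss_adv2
    seg_optimizer.zero_grad()
    loss.backward()
    seg_optimizer.step()
    return loss.item()
```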
  • the computer device stops training the model to obtain an image segmentation model that has completed the training.
  • the segmentation result of the trained image segmentation model for the target domain image is more similar to the standard segmentation result.
  • Optionally, the stop-training condition of the image segmentation model can be set in advance, such as the value of the loss function reaching a preset threshold, the number of training rounds reaching a preset number, or the training duration reaching a preset duration, which is not limited in the embodiments of the present application.
  • Optionally, before inputting the source domain image and the target domain image into the image segmentation model, the computer device normalizes them to obtain a processed source domain image and a processed target domain image; for example, the pixel value of each pixel in the source domain image and the target domain image is normalized to [-1, 1]. The processed source domain image and processed target domain image are then used for image segmentation model training.
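  • A one-function sketch of this normalization is shown below (the helper name is hypothetical; it assumes a float tensor or array whose minimum and maximum differ):

```python
def normalize_to_unit_range(image):
    """Linearly rescale pixel values into [-1, 1], as described above for the
    source- and target-domain images before they enter the segmentation model."""
    lo, hi = image.min(), image.max()
    return 2.0 * (image - lo) / (hi - lo) - 1.0
```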
  • the above-mentioned first discriminator and the second discriminator share parameters.
  • the parameters of the first discriminator and the second discriminator are shared in real time. For example, in each round of training, when the parameters of the first discriminator are updated, the updated parameters are synchronized to the second discriminator, the second discriminator is trained with the synchronized parameters, and the parameters are updated again, and Synchronize the updated parameters to the first discriminator.
  • the first discriminator and the second discriminator share parameters in real time, which helps to improve the training efficiency of the model.
  • In another example, the first discriminator and the second discriminator share only the initial training parameters and then update their parameters independently. In this case, the first discriminator can be trained first and then the second discriminator, the second discriminator can be trained first and then the first discriminator, or the two discriminators can be trained at the same time, which is not limited in the embodiments of the present application.
  • the initial learning rates of the image segmentation network, the first discriminator and the second discriminator are preset values.
  • For example, the initial learning rates of the image segmentation network, the first discriminator, and the second discriminator are 1.5×10^-5, 1×10^-5, and 1×10^-5, respectively.
  • In summary, in the technical solution provided by the embodiments of the present application, the segmentation result of the source domain image and the segmentation result of the target domain image are input to the first discriminator, and the segmentation result of the source domain image and the standard segmentation result of the source domain image are input to the second discriminator, to obtain the discriminant loss function and adversarial loss function of the first discriminator and the discriminant loss function and adversarial loss function of the second discriminator. After that, the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator are fed back to the image segmentation model; the parameters of the pre-trained image segmentation model are adjusted by minimizing the value of the loss function of the image segmentation model and the weighted sum of the adversarial loss functions of the two discriminators, and by maximizing the values of the discriminant loss functions of the first and second discriminators, to obtain a trained image segmentation model.
  • In addition, before the source domain image and the target domain image are input to the image segmentation model, they are normalized, so that the input images and the discriminators' discrimination results are processed on the same scale, which further improves the training and optimization of the image segmentation model.
  • Referring to FIG. 4, it exemplarily shows a schematic diagram of segmentation results under different segmentation methods, in which:
  • (a) represents the source domain image
  • (b) represents the target domain image
  • (c) represents the segmentation result of the source domain image produced by an image segmentation model trained using only the source domain images;
  • (d) represents the segmentation result of the target domain image produced by an image segmentation model trained with the source domain images and their standard segmentation results but without domain-adaptive training;
  • (e) represents the segmentation result of the target domain image produced by the trained image segmentation model obtained with the training method provided by this scheme;
  • (f) represents the standard segmentation result of the target domain image.
  • the trained image segmentation model can be deployed in a computer device.
  • After the computer device obtains an image to be segmented from the target domain, the trained image segmentation model is called to accurately segment the target area in the image, obtaining the segmentation result of the image to be segmented.
  • For example, the trained image segmentation model is deployed in an auxiliary diagnosis platform; when a patient's medical image is collected, the platform can directly segment accurate distribution information of the lesion area, helping the doctor make an accurate diagnosis.
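  • A hypothetical deployment helper illustrating this inference step (a sketch only; the function name and single-image batching are assumptions):

```python
import torch
import torch.nn.functional as F

def segment_image(trained_seg_model, image):
    """Apply the trained segmentation model to a single target-domain image
    (C, H, W) and return the per-pixel class-label map (H, W)."""
    trained_seg_model.eval()
    with torch.no_grad():
        logits = trained_seg_model(image.unsqueeze(0))      # add a batch dimension
        probs = F.softmax(logits, dim=1)
        return probs.argmax(dim=1).squeeze(0)
```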
  • the three data sets are the BRATS (Brain Tumor Segmentation) 2018 data set, the private brain glioma data set, and the multi-center SCGM (Spinal Cord Gray Matter) 2017 data set.
  • The BRATS 2018 data set includes 285 samples with label sets, and each sample has 4 modalities, namely FLAIR (Fluid-Attenuated Inversion Recovery), contrast-enhanced T1, T1 MRI, and T2 MRI.
  • The preprocessing of these data includes skull stripping, registration, and resampling to a resolution of 1×1×1 mm³; the dimensions of each sample are 240×240×155.
  • the T2 MRI data set in this data set was used, and the 3D T2 MRI axial view was converted into a multi-layer 2D image.
  • The private glioma data set includes 200 samples with a label set. Each sample has only thick-slice 3D T2 MRI data, and the label set marks only the tumor edema area (that is, the whole tumor area). Since the data set was scanned with thick slices, only the axial view has clear structure, while the images of the other two views (the coronal and sagittal views) are very blurry. Therefore, during testing only the axial plane is used: the 3D T2 MRI axial view is converted into multi-layer 2D images, and the 2D images are resampled to a size of 513×513. In addition, the only preprocessing applied to these data is skull stripping.
  • the SCGM 2017 data set includes data from 4 different centers, a total of 40 samples with label sets.
  • The resolutions of the data range from 0.25×0.25×0.25 mm³ to 0.5×0.5×0.5 mm³, and the 3D T2 MRI axial view is converted into multi-layer 2D images.
  • In the brain tumor segmentation task, two tests were designed: Test 1 uses the BRATS 2018 data as the source domain data and the private glioma data as the target domain data; Test 2 uses the private glioma data as the source domain data and the BRATS 2018 data as the target domain data.
  • DeepLabv2 is also used as the segmentation model for comparison.
  • The results are also compared with the ADDA (Adversarial Discriminative Domain Adaptation) algorithm, an unsupervised domain adaptation method.
  • Table-1 shows the test results of Test 1 and Test 2 in the brain tumor segmentation task.
  • In Table-1, the first row lists the evaluation metrics for image segmentation: the Dice coefficient (Dice Score) measures the similarity of two sets; sensitivity represents the proportion of correctly segmented positive results among all positive results; specificity represents the corresponding proportion among negative results; and the Hausdorff distance is a distance defined between any two sets in a metric space.
  • The Dice coefficient, sensitivity, and specificity are directly proportional to the accuracy of the image segmentation model, while the Hausdorff distance is inversely proportional to it.
  • The P in the second row indicates that the private glioma data is used as the source domain data and the BRATS 2018 data as the target domain data; the B in the second row indicates that the BRATS 2018 data is used as the source domain data and the private glioma data as the target domain data.
  • The third to fifth rows represent the test results obtained using the DeepLabv3+ and DeepLabv2 segmentation models and the ADDA segmentation algorithm.
  • The sixth row (¹Ours) represents the test results obtained by this scheme using DeepLabv2 as the segmentation model.
  • The seventh row (²Ours) represents the test results obtained by this scheme using DeepLabv3+ as the segmentation model.
  • FIG. 5 shows example diagrams of brain tumor segmentation results in different segmentation methods.
  • Row P indicates that the private glioma data is used as the source domain data and the BRATS 2018 data as the target domain data.
  • Row B indicates that the BRATS 2018 data is used as the source domain data and the private glioma data as the target domain data.
  • The first column (Axial) shows the axial view of the data; the second column shows the GT (Ground Truth) standard segmentation; the third column shows the test results obtained using DeepLabv3+ as the segmentation model with DA (Domain Adaptation), i.e., the test results of this scheme.
  • The fourth column shows the test results obtained using only DeepLabv3+ as the segmentation model without DA; the fifth column shows the test results obtained using DeepLabv2 as the segmentation model with DA; the sixth column shows the test results obtained using only DeepLabv2 as the segmentation model without DA. It can be seen intuitively from FIG. 5 that the technical solution of the present application can accurately segment the target domain image.
  • In the spinal cord gray matter segmentation task, two test schemes were designed with reference to related technical schemes: Test 1 uses the data of center 1 and center 2 as the source domain data and the data of center 3 as the target domain data; Test 2 uses the data of centers 1 and 2 as the source domain data and the data of center 4 as the target domain data. The results are also compared with the segmentation results of two related technical solutions, namely the EMA (Exponential Moving Average) segmentation model and the UDASE (Unsupervised Domain Adaptation with Self-Ensembling) segmentation model.
  • The two test schemes designed in this application are the same as the test schemes provided in the related technologies, so as to compare the effects of the technical solution provided in this application with those of the related technologies.
  • Table-2 shows the test results of Test 1 and Test 2 in the spinal cord gray matter segmentation task.
  • the DeepLabv3+ row shows the test results obtained using DeepLabv3+ as the segmentation model; the EMA and UDASE rows show the test results of the related technologies; the Ours row shows the test results of this solution.
  • it can be seen from Table-2 that for Test 2, i.e., the test in which the source domain data of centers 1 and 2 are adapted to center 4, the segmentation performance of this solution's segmentation model is significantly better than the solutions provided by the related technologies.
  • for Test 1, in which the source domain data of centers 1 and 2 are adapted to center 3, compared with the test results obtained using only DeepLabv3+ for segmentation without adaptation, this solution significantly improves the segmentation performance on the target domain data.
  • FIG. 6 shows example diagrams of spinal cord gray matter segmentation results under different segmentation methods.
  • the first and fourth columns show the segmentation results obtained with the related technologies in Test 1 and Test 2, respectively;
  • the second and fifth columns show the test results obtained in Test 1 and Test 2 using DeepLabv3+ as the segmentation model with DA, i.e., the test results of this solution;
  • the third and sixth columns show the test results obtained in Test 1 and Test 2 using only DeepLabv3+ as the segmentation model, without DA.
  • in summary, the technical solution provided by this application uses DeepLabv3+ as the segmentation model and performs domain adaptation in the output space, improving the segmentation performance and generalization ability of the finally trained image segmentation model, so that it can segment the target domain image accurately and the segmentation result of the target domain image is more accurate.
  • FIG. 7 shows a block diagram of an image segmentation model training device provided by an embodiment of the present application.
  • the device has the function of realizing the example of the training method of the image segmentation model, and the function can be realized by hardware, or by hardware executing corresponding software.
  • the device can be the computer equipment described above, or it can be set on the computer equipment.
  • the device 700 may include: a first training module 710, a result extraction module 720, a second training module 730, a third training module 740, and a fourth training module 750.
  • the first training module 710 is configured to train an initial image segmentation model using source domain samples to obtain a pre-trained image segmentation model.
  • the source domain samples include source domain images and standard segmentation results of the source domain images.
  • the result extraction module 720 is configured to extract the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image through the pre-trained image segmentation model.
  • the second training module 730 is configured to train the first discriminator using the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image; the first discriminator is used to discriminate whether an input segmentation result comes from the source domain or the target domain.
  • the third training module 740 is configured to train the second discriminator using the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image; the second discriminator is used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result.
  • the fourth training module 750 is configured to retrain the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain the trained image segmentation model.
  • in summary, the predicted segmentation results of the source domain image and the target domain image are extracted through the image segmentation model pre-trained on source domain samples; the predicted segmentation results of the source domain image and the target domain image are then input to the first discriminator, the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image are input to the second discriminator, and the pre-trained image segmentation model is retrained using the idea of adversarial learning, iterating in this way until convergence.
  • the technical solution provided by this application aligns the source domain image and the target domain image in the output space, so that the trained image segmentation model can reduce the difference between the source domain image and the target domain image in the output space, reducing the error of the trained segmentation model on the target domain and further making the segmentation result of the target domain image more accurate.
  • in some possible designs, the second training module 730 is configured to input the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image to the first discriminator, respectively, to obtain the discrimination result of the first discriminator; to calculate, according to the discrimination result of the first discriminator, the value of the discriminant loss function of the first discriminator, where the discriminant loss function of the first discriminator is used to measure the discrimination accuracy of the first discriminator; and to adjust the parameters of the first discriminator by minimizing the value of the discriminant loss function of the first discriminator.
  • the device 700 further includes: a first calculation module 760.
  • the first calculation module 760 is configured to calculate the value of the adversarial loss function of the first discriminator according to the discrimination result of the first discriminator for the predicted segmentation result of the target domain image; the adversarial loss function of the first discriminator is used to measure the degree of difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image.
  • in some possible designs, the third training module 740 is configured to input the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image to the second discriminator, respectively, to obtain the discrimination result of the second discriminator; to calculate, according to the discrimination result of the second discriminator, the value of the discriminant loss function of the second discriminator, where the discriminant loss function of the second discriminator is used to measure the discrimination accuracy of the second discriminator; and to adjust the parameters of the second discriminator by minimizing the value of the discriminant loss function of the second discriminator.
  • the apparatus 700 further includes: a second calculation module 770.
  • the second calculation module 770 is configured to calculate the value of the adversarial loss function of the second discriminator according to the discrimination result of the second discriminator for the predicted segmentation result of the source domain image; the adversarial loss function of the second discriminator is used to measure the degree of difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image.
  • in some possible designs, the fourth training module 750 is configured to construct an objective function from the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator; and to adjust the parameters of the pre-trained image segmentation model by minimizing the value of the weighted sum of the loss function of the image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, while maximizing the value of the discriminant loss function of the first discriminator and the value of the discriminant loss function of the second discriminator, to obtain the trained image segmentation model.
  • the first discriminator and the second discriminator share parameters.
  • the apparatus 700 further includes: an image processing module 780.
  • the image processing module 780 is configured to normalize the source domain image and the target domain image to obtain a processed source domain image and a processed target domain image; the processed source domain image and the processed target domain image are used for training the image segmentation model.
  • in addition, an embodiment of the present application provides an image segmentation device.
  • the device has the function of realizing the above example of the image segmentation method, and the function can be realized by hardware, or by hardware executing corresponding software.
  • the device can be the computer equipment described above, or it can be set on the computer equipment.
  • the device may include: an acquisition module and a calling module.
  • the acquisition module is used to acquire the image to be segmented from the target domain.
  • the calling module is used to call the trained image segmentation model to process the image to be segmented to obtain the segmentation result of the image to be segmented.
  • the trained image segmentation model is obtained by training the image segmentation model with adversarial learning in the output space, through the first discriminator and the second discriminator;
  • the first discriminator is used to reduce, in the process of training the image segmentation model, the difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image; the second discriminator is used to reduce, in the process of training the image segmentation model, the difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image.
  • it should be noted that, when the device provided in the above embodiments implements its functions, the division into the above functional modules is only used as an example; in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the apparatus embodiments and the method embodiments provided above belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
  • FIG. 9 shows a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • the computer device can be any electronic device with data processing and storage functions, such as a PC or server.
  • the computer device is used to implement the training method of the image segmentation model provided in the foregoing embodiment. Specifically:
  • the computer device 900 includes a central processing unit (CPU) 901, a system memory 904 including a random access memory (RAM) 902 and a read-only memory (ROM) 903, and a system bus 905 connecting the system memory 904 and the central processing unit 901.
  • the computer device 900 also includes a basic input/output system (I/O system) 906 that helps transfer information between the various components in the computer, and a mass storage device 907 for storing an operating system 913, application programs 914, and other program modules 912.
  • the basic input/output system 906 includes a display 908 for displaying information and an input device 909 such as a mouse and a keyboard for the user to input information.
  • the display 908 and the input device 909 are both connected to the central processing unit 901 through the input and output controller 910 connected to the system bus 905.
  • the basic input/output system 906 may also include an input and output controller 910 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus.
  • the input and output controller 910 also provides output to a display screen, a printer, or other types of output devices.
  • the mass storage device 907 is connected to the central processing unit 901 through a mass storage controller (not shown) connected to the system bus 905.
  • the mass storage device 907 and its associated computer readable medium provide non-volatile storage for the computer device 900. That is, the mass storage device 907 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • Computer-readable media may include computer storage media and communication media.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM). Of course, those skilled in the art will appreciate that the computer storage medium is not limited to the above. The system memory 904 and the mass storage device 907 above may be collectively referred to as the memory.
  • according to various embodiments of the present application, the computer device 900 can also run by being connected, through a network such as the Internet, to a remote computer on the network. That is, the computer device 900 can be connected to the network 912 through the network interface unit 911 connected to the system bus 905; in other words, the network interface unit 911 can also be used to connect to other types of networks or remote computer systems (not shown).
  • the memory also stores at least one instruction, at least one program, a code set, or an instruction set, which is configured to be executed by one or more processors to implement the above training method of the image segmentation model or the above image segmentation method.
  • in an exemplary embodiment, a computer device is also provided. The computer device can be a terminal or a computer device. The computer device includes a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above training method of the image segmentation model or the above image segmentation method.
  • in an exemplary embodiment, a computer-readable storage medium is also provided; the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which, when executed by a processor, implements the above training method of the image segmentation model or the above image segmentation method.
  • in an exemplary embodiment, a computer program product is also provided; when executed by a processor, it is used to implement the training method of the image segmentation model or the image segmentation method provided in the foregoing embodiments.


Abstract

This application provides a training method, apparatus, device, and storage medium for an image segmentation model. The method includes: training an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model; extracting, through the pre-trained image segmentation model, the predicted segmentation result of a source domain image and the predicted segmentation result of a target domain image; training a first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image; training a second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image; and retraining the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating in this way until convergence to obtain a trained image segmentation model.

Description

Training method and apparatus for an image segmentation model, computer device, and storage medium
This application claims priority to Chinese Patent Application No. 2019104480956, entitled "Training method, apparatus, device, and storage medium for an image segmentation model", filed with the Chinese Patent Office on May 27, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of this application relate to the technical field of image recognition, and in particular to a training method and apparatus for an image segmentation model, a computer device, and a storage medium.
Background
Image segmentation refers to classifying each pixel in an image and marking out the target region. Image segmentation can be applied in fields such as medical image analysis, autonomous driving, geographic information systems, and underwater object detection. For example, in medical image analysis, image segmentation can be used for tasks such as locating tumors and other lesions, measuring tissue volume, and studying anatomical structures.
Traditional image segmentation methods rely on a large number of labeled images, and they assume that the data distributions of the training image set (i.e., the source domain images) and the test image set (i.e., the target domain images) are consistent. In practical applications, however, complex and diverse image data distributions rarely satisfy this assumption; as a result, a model trained on a specific image set generalizes poorly, and its test performance drops considerably on image sets from different domains or with domain shift.
In the related art, during training of an image segmentation model, the source domain images and the target domain images are aligned in the feature space, so that the finally trained model can adapt to the target domain images in the feature space. However, an image still undergoes multiple processing steps from the feature space to the output space, so the image segmentation result output for the target domain image in the output space is not accurate enough.
Summary
According to various embodiments of this application, a training method and apparatus for an image segmentation model, a computer device, and a storage medium are provided.
A training method for an image segmentation model, executed by a computer device, the method including:
training an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model, the source domain samples including a source domain image and a standard segmentation result of the source domain image;
extracting, through the pre-trained image segmentation model, a predicted segmentation result of the source domain image and a predicted segmentation result of a target domain image;
training a first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image, the first discriminator being used to discriminate whether an input segmentation result comes from the source domain or the target domain;
training a second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image, the second discriminator being used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result; and
retraining the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain a trained image segmentation model.
An image segmentation method, executed by a computer device, the method including:
acquiring an image to be segmented from a target domain; and
calling a trained image segmentation model to process the image to be segmented to obtain a segmentation result of the image to be segmented, the trained image segmentation model being obtained by training an image segmentation model with adversarial learning in the output space through a first discriminator and a second discriminator,
where the first discriminator is used to reduce, during training of the image segmentation model, the difference between the predicted segmentation result of a target domain image and the predicted segmentation result of a source domain image, and the second discriminator is used to reduce, during training of the image segmentation model, the difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image.
A training apparatus for an image segmentation model, the apparatus including:
a first training module, configured to train an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model, the source domain samples including a source domain image and a standard segmentation result of the source domain image;
a result extraction module, configured to extract, through the pre-trained image segmentation model, a predicted segmentation result of the source domain image and a predicted segmentation result of a target domain image;
a second training module, configured to train a first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image, the first discriminator being used to discriminate whether an input segmentation result comes from the source domain or the target domain;
a third training module, configured to train a second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image, the second discriminator being used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result; and
a fourth training module, configured to retrain the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain a trained image segmentation model.
A computer device, including a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above training method for an image segmentation model or the above image segmentation method.
A computer-readable storage medium, storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the above training method for an image segmentation model or the above image segmentation method.
A computer program product which, when executed, is used to perform the above training method for an image segmentation model or to implement the above image segmentation method.
The details of one or more embodiments of this application are set forth in the drawings and the description below. Other features and advantages of this application will become apparent from the specification, the drawings, and the claims.
Brief Description of the Drawings
FIG. 1 is a flowchart of a training method for an image segmentation model provided by an embodiment of this application;
FIG. 2 exemplarily shows a schematic flow diagram of the image segmentation model training method;
FIG. 3 is a flowchart of a training method for an image segmentation model provided by another embodiment of this application;
FIG. 4 exemplarily shows schematic diagrams of segmentation results under different segmentation methods;
FIG. 5 shows example diagrams of brain tumor segmentation results under different segmentation methods;
FIG. 6 shows example diagrams of spinal cord gray matter segmentation results under different segmentation methods;
FIG. 7 is a block diagram of a training apparatus for an image segmentation model provided by an embodiment of this application;
FIG. 8 is a block diagram of a training apparatus for an image segmentation model provided by an embodiment of this application;
FIG. 9 is a schematic structural diagram of a computer device provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
Image segmentation refers to classifying each pixel in an image and marking out the target region. Image segmentation can be applied in fields such as medical image analysis, autonomous driving, geographic information systems, and underwater object detection. In medical image analysis, image segmentation can be used for locating tumors and other lesions, measuring tissue volume, studying anatomical structures, and so on. In autonomous driving, after an on-board camera or lidar captures an image of the environment, image segmentation can be used to process the image, detect the ground, identify drivable areas, and then plan a driving path. In geographic information systems, after satellite remote sensing images are collected, image segmentation can be used to process them, identify roads, rivers, crops, buildings, and the like, and label each pixel in the image.
In the technical solution provided by the embodiments of this application, an image segmentation model that performs domain adaptation in the output space is proposed based on DCNNs (Deep Convolutional Neural Networks) and the idea of adversarial learning. The predicted segmentation results of the source domain image and the target domain image are extracted through an image segmentation model pre-trained on source domain samples; the predicted segmentation results of the source domain image and the target domain image are then input to a first discriminator, and the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image are input to a second discriminator; the pre-trained image segmentation model is trained iteratively with the idea of adversarial learning until the model converges, obtaining a trained image segmentation model. The technical solution provided by this application aligns the source domain image and the target domain image in the output space, so that the trained image segmentation model can reduce the difference between them in the output space, reducing the error of the trained segmentation model on the target domain and further making the segmentation result of the target domain image more accurate.
In the method provided by the embodiments of this application, each step can be executed by a computer device, which is an electronic device with data computing, processing, and storage capabilities, such as a PC (Personal Computer) or a server.
Please refer to FIG. 1, which shows a flowchart of a training method for an image segmentation model provided by an embodiment of this application. The method is applied to a computer device and may include the following steps (101 to 105):
Step 101: train an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model.
The source domain samples include source domain images and the standard segmentation results of the source domain images. A source domain image can be an image collected by an image acquisition device (such as a camera, medical equipment, or lidar), an image stored locally in advance, or an image obtained from a network, which is not limited in the embodiments of this application. In addition, the source domain image can be an image in a picture format or a video image; the format of the source domain image is also not limited in the embodiments of this application.
A target region exists in the source domain image. The target region can be a region of interest to the user, such as a person region, an animal region, a plant region, or another designated region in a landscape image; it can also be a tissue or organ region, a cell region, or a lesion region in a medical image, which is not limited in the embodiments of this application.
The standard segmentation result of the source domain image is the source domain image with the target region accurately annotated, i.e., the real segmentation label. The standard segmentation result can be manually annotated by professionals.
Taking the source domain image being a medical image as an example, the target region can be the lesion region in the medical image, and the standard segmentation result of the source domain image is then the medical image with the lesion region accurately annotated, which is useful for clinical diagnosis and medical research. For example, for a medical image of a certain part of a patient's body that contains a tumor region, a clinician or other relevant personnel need a fairly accurate location of the tumor region for clinical diagnosis and medical research, so the tumor region is the target region of the medical image. The image segmentation model is used to segment the target region in an image input to it and obtain a segmentation result corresponding to the input image. Since the source domain samples include the source domain images and their standard segmentation results, the source domain samples can be used to train the initial image segmentation model and update its relevant parameters, yielding a pre-trained image segmentation model whose segmentation result for the same image is more accurate than that of the initial model.
The framework of the image segmentation model can be a CNN (Convolutional Neural Network), a DCNN, a ResNet (Residual Neural Network), a DenseNet (Densely Connected Convolutional Network), or another model structure usable for image segmentation, which is not limited in the embodiments of this application.
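Whatever backbone is chosen, the pre-training of step 101 is plain supervised learning on the source domain. A minimal sketch in Python follows; the patent names no framework, so PyTorch, the generic `model` (which is assumed to map an image batch straight to per-pixel class logits), and the hyper-parameters are illustrative assumptions rather than the patent's reference implementation:

```python
import torch
import torch.nn.functional as F

def pretrain(model, source_loader, optimizer, epochs=10):
    """Step 101 sketch: supervised pre-training on source domain samples only.
    `source_loader` yields (image, standard segmentation label) pairs, with
    labels as (N, H, W) integer class maps."""
    model.train()
    for _ in range(epochs):
        for x_s, y_s in source_loader:
            logits = model(x_s)                 # (N, C, H, W) class logits
            loss = F.cross_entropy(logits, y_s) # pixel-wise classification loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```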
Step 102: extract, through the pre-trained image segmentation model, the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image.
A target domain image belongs to the same type of task as the source domain image but has a different image data distribution. For example, in medical image analysis, the target domain image and the source domain image are both used to detect tumor regions, but they come from different acquisition devices, or from different hospitals or different centers, all of which cause large differences between the distributions of target domain images and source domain images. For another example, the target domain image is a CT (Computed Tomography) image and the source domain image is an MRI (Magnetic Resonance Imaging) image; since the two kinds of medical images emphasize different information, the distribution of the tumor region differs between CT images and MRI images. For yet another example, in autonomous driving, the target domain image and the source domain image are both used to identify drivable areas on the ground, but the target domain image is captured with an on-board camera while the source domain image is captured with lidar; because images collected by different devices have different representations, the ground and the drivable areas differ.
Taking medical images as an example, medical images have multiple modalities, such as MRI, CT, PET (Positron Emission Computed Tomography), and PD (Proton Density weighted images), and medical images of different modalities show the same region with different degrees of change; this is the domain shift problem. Similarly, when the collected medical images come from different imaging devices in different hospitals (centers), the data distributions of medical images of the same modality also differ considerably, which is also a domain shift problem. The target domain image and the source domain image can both be intended to segment brain tumor tissue, but come from different centers or different hospitals, i.e., the distributions of the tumor region in the target domain image and the source domain image are not the same.
The predicted segmentation result of the source domain image is the image obtained after the target region is annotated in the source domain image by the image segmentation model; the predicted segmentation result of the target domain image is the image obtained after the target region is annotated in the target domain image by the image segmentation model.
After the computer device inputs the source domain image and the target domain image into the pre-trained image segmentation model, the model obtains the feature maps of the source domain image and the target domain image and then annotates the class information of each pixel in the feature maps to mark out the target region, i.e., obtains the predicted segmentation result of the image. Continuing with segmenting a tumor region in a medical image as an example, the image segmentation model needs to distinguish whether each pixel in the image belongs to the tumor region and annotate the pixels that do, thereby obtaining an image with the tumor region segmented.
Step 103: train the first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image.
After extracting the predicted segmentation results of the source domain image and the target domain image, the computer device inputs these segmentation results into the first discriminator to train it. The first discriminator is used to discriminate whether an input segmentation result comes from the source domain or the target domain. Training the first discriminator allows it to discriminate as accurately as possible whether an input segmentation result comes from the source domain or the target domain.
Optionally, the discriminator can be built with CNNs. Illustratively, the CNN can include multiple convolutional layers, for example 5 convolutional layers, each with a kernel size of 2, a stride of 2, and a padding of 1. In addition, each of the first 4 layers can be followed by an activation function layer, which can be a leaky ReLU layer, a ReLU layer, an RReLU layer, or the like; the output of the last convolutional layer is 2, corresponding to the classes of the input predicted segmentation result discriminated by the discriminator, such as "from the source domain" and "from the target domain".
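To make this structure concrete, below is a minimal sketch in PyTorch. The layer counts, kernel size, stride, padding, and final 2-channel output follow the description above; the intermediate channel widths and the leaky-ReLU slope are not specified by the text and are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Fully convolutional discriminator sketched from the description above:
    5 conv layers (kernel 2, stride 2, padding 1), a leaky ReLU after each of
    the first 4, and 2 output channels for the two discriminated classes."""

    def __init__(self, in_channels: int, base_channels: int = 64):
        super().__init__()
        widths = [base_channels, base_channels * 2,
                  base_channels * 4, base_channels * 8]  # assumed widths
        layers, prev = [], in_channels
        for w in widths:  # first four conv layers, each followed by leaky ReLU
            layers += [nn.Conv2d(prev, w, kernel_size=2, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            prev = w
        # fifth conv layer: 2 output channels, one per discriminated class
        layers.append(nn.Conv2d(prev, 2, kernel_size=2, stride=2, padding=1))
        self.model = nn.Sequential(*layers)

    def forward(self, seg_map: torch.Tensor) -> torch.Tensor:
        # seg_map: softmax segmentation output of shape (N, C, H, W)
        return self.model(seg_map)
```

Here `in_channels` equals the number of segmentation classes C, since the discriminator consumes the segmentation model's output-space probability maps rather than raw images.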
Step 104: train the second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image.
After extracting the predicted segmentation result of the source domain image, the computer device can also input this segmentation result, together with the standard segmentation result of the source domain image, into the second discriminator to train it. The second discriminator is used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result. Training the second discriminator allows it to discriminate as accurately as possible whether an input segmentation result is a predicted or a standard segmentation result.
Optionally, the second discriminator can also be built with CNNs; its structure can be the same as or different from that of the first discriminator, which is not limited in the embodiments of this application.
Step 105: retrain the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain the trained image segmentation model.
The loss function of the pre-trained image segmentation model is used to measure the segmentation accuracy of the image segmentation model.
The adversarial loss function of the first discriminator is used to measure the degree of difference between the predicted segmentation result of the target domain image and that of the source domain image. The two predicted segmentation results are input to the first discriminator for adversarial learning: the first discriminator tries to discriminate whether an input segmentation result comes from the source domain or the target domain, while the image segmentation model tries to segment the target domain image accurately enough that the first discriminator judges its segmentation result as coming from the source domain. This adversarial learning process improves the segmentation accuracy of the image segmentation model.
The adversarial loss function of the second discriminator is used to measure the degree of difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image. Both are input to the second discriminator for adversarial learning: the second discriminator tries to discriminate whether an input segmentation result is the predicted or the standard segmentation result, while the image segmentation model tries to segment the source domain image accurately enough that the second discriminator judges its predicted segmentation result as a standard segmentation result. This adversarial learning process reduces the difference between the predicted and standard segmentation results of the source domain image.
The computer device trains the pre-trained image segmentation model in a loop with the loss function of the pre-trained image segmentation model and the adversarial loss functions of the first and second discriminators until the model converges, obtaining the trained image segmentation model. This loop training includes repeatedly executing steps 102 to 105 above and continuously adjusting the parameters of the image segmentation model according to the values of the loss function of the image segmentation model and of the adversarial loss functions of the two discriminators obtained in each training round, until the model converges. The trained image segmentation model can reduce the difference between the source domain image and the target domain image, reduce the error of the trained model in segmenting the target domain, and further make the visual information of the target domain image output in the output space more accurate.
With reference to FIG. 2, which exemplarily shows a schematic flow diagram of the image segmentation model training method: X_S denotes the source domain image, Y_S the standard segmentation result of the source domain image, X_T the target domain image, P_S the segmentation result of the source domain image, P_T the segmentation result of the target domain image, L_D1(P_T) the discriminant loss function of the first discriminator, L_Adv1(X_T) the adversarial loss function of the first discriminator, L_D2(P_S) the discriminant loss function of the second discriminator, L_Adv2(X_S) the adversarial loss function of the second discriminator, and L_Seg(X_S) the loss function of the pre-trained image segmentation model.
As shown in FIG. 2, the computer device inputs the source domain image and the target domain image into the image segmentation model (which can be the pre-trained image segmentation model) to obtain the segmentation results of the source domain image and the target domain image; inputs the segmentation results of the source domain image and the target domain image into the first discriminator to obtain its discrimination result and further its discriminant and adversarial loss functions; inputs the segmentation result of the source domain image and the standard segmentation result of the source domain image into the second discriminator to obtain its discrimination result and further its discriminant and adversarial loss functions; and then feeds the loss function of the pre-trained image segmentation model and the adversarial loss functions of the two discriminators back to the image segmentation model, adjusting the parameters of the pre-trained image segmentation model by minimizing the value of the weighted sum of the loss function of the image segmentation model and the two adversarial loss functions while maximizing the values of the discriminant loss functions of the two discriminators, to obtain the trained image segmentation model. The trained image segmentation model can accurately segment images from the target domain and has good segmentation performance and generalization ability.
The technical solution provided by the embodiments of this application can be applied to the model training process of image segmentation tasks in the AI (Artificial Intelligence) field, and is particularly suitable for training image segmentation models on training datasets with domain shift. Taking the segmentation of medical images of different modalities as an example, the training dataset can include multiple medical images captured by different medical devices. In this application scenario, the input is a medical image and the output is the segmentation result of the lesion region; the image segmentation network is then optimized through the first and second discriminators so that the segmentation results the model predicts for the source domain image and the target domain image are as close as possible to the standard segmentation result of the source domain image, finally training a more accurate image segmentation model that assists doctors in lesion diagnosis and analysis.
In summary, in the technical solution provided by the embodiments of this application, the predicted segmentation results of the source domain image and the target domain image are extracted through an image segmentation model pre-trained on source domain samples; the predicted segmentation results of the source domain image and the target domain image are then input to the first discriminator, and the predicted segmentation result and the standard segmentation result of the source domain image are input to the second discriminator; the pre-trained image segmentation model is trained iteratively with the idea of adversarial learning until the model converges, obtaining the trained image segmentation model. The technical solution provided by this application aligns the source domain image and the target domain image in the output space, so that the trained image segmentation model can reduce the difference between them in the output space, reducing the error of the trained segmentation model on the target domain and further making the segmentation result of the target domain image more accurate.
In addition, in the embodiments of this application, on the basis of the first discriminator, the image segmentation model is further trained through the second discriminator, so that the segmentation results the model predicts for the source domain image and the target domain image are as close as possible to the standard segmentation result of the source domain image, further improving the accuracy of the model.
Please refer to FIG. 3, which shows a flowchart of a training method for an image segmentation model provided by another embodiment of this application. The method may include the following steps (301 to 312):
Step 301: train an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model.
The source domain samples include source domain images and the standard segmentation results of the source domain images.
In the embodiments of this application, the image segmentation model can be the DeepLabv3+ model. The DeepLabv3+ model includes an ASPP (Atrous Spatial Pyramid Pooling) module and an encoder-decoder structure, combining the advantages of both: ASPP can encode texture information at different scales in the data through pooling operations at multiple rates and with different receptive fields, while the encoder-decoder structure can obtain clearer object boundary information by gradually recovering spatial information.
In some other embodiments, the image segmentation model can also be the DeepLabv2 model, the RefineNet model, the ResNet model, and so on, which is not limited in the embodiments of the present disclosure.
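For concreteness, one way such a backbone might be instantiated is sketched below. Note that torchvision ships DeepLabv3 rather than DeepLabv3+ (the two share the ASPP module but differ in the decoder), so it is used here only as a readily available stand-in, not as the patent's model; the class count and input size are illustrative:

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# Stand-in for the segmentation model G_Seg; num_classes = C segmentation classes.
num_classes = 2  # e.g. tumor vs. background (illustrative)
G_seg = deeplabv3_resnet50(weights=None, num_classes=num_classes)

# Grayscale medical slices would be replicated to 3 channels for this backbone.
x = torch.randn(1, 3, 513, 513)        # one input slice
logits = G_seg(x)["out"]               # (1, num_classes, 513, 513)
probs = torch.softmax(logits, dim=1)   # probability maps, as fed to the discriminators
```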
Step 302: extract, through the pre-trained image segmentation model, the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image.
In one possible implementation, after training the pre-trained image segmentation model with the source domain samples, the computer device inputs the source domain samples into the pre-trained image segmentation model again to extract the predicted segmentation results of the source domain images.
In another possible implementation, in step 301 above, the source domain samples include a first sample set and a second sample set; the computer device can train the initial image segmentation model with the first sample set to obtain the pre-trained image segmentation model, and then retrain the pre-trained model with the second sample set. In this case, the input of the pre-trained image segmentation model is the second sample set of the source domain samples, and the predicted segmentation results extracted are those of the source domain images in the second sample set.
The target domain image has been introduced in the embodiment of FIG. 1 and is not repeated here. The target domain image is input to the pre-trained image segmentation model so that its predicted segmentation result can be extracted.
Step 303: input the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image to the first discriminator, respectively, to obtain the discrimination result of the first discriminator.
The first discriminator is used to discriminate whether an input segmentation result comes from the source domain or the target domain, i.e., it performs a binary classification task. For example, the result of the first discriminator can be 0 or 1: a result of 0 indicates that the input segmentation result comes from the source domain, and a result of 1 indicates that it comes from the target domain.
Step 304: calculate the value of the discriminant loss function of the first discriminator according to the discrimination result of the first discriminator.
The discriminant loss function of the first discriminator is used to measure the discrimination accuracy of the first discriminator. It can be written as

$$L_{D1}(P_T) = -\sum_{h,w}\Big[(1-z)\,\log\big(D(P_T)^{(h,w,0)}\big) + z\,\log\big(D(P_T)^{(h,w,1)}\big)\Big]$$

where z is a constant: z = 1 indicates that the image is a target domain image, and z = 0 indicates that it is a source domain image. P_T is the predicted segmentation result of the target domain image:

$$P_T = G_{Seg}(X_T)$$

where G_Seg denotes the image segmentation model, X_T denotes the target domain image, and P_T ∈ R^(H×W×C), with H and W the height and width of the predicted segmentation of the target domain image and C the number of segmentation classes.
Step 305: adjust the parameters of the first discriminator by minimizing the value of the discriminant loss function of the first discriminator.
Since the value of the discriminant loss function reflects the discrimination accuracy of the first discriminator and is inversely proportional to it (the smaller the value, the higher the accuracy), during training the computer device can adjust the parameters of the first discriminator by minimizing this value, so that the first discriminator discriminates as accurately as possible whether an input segmentation result comes from the source domain or the target domain.
Step 306: calculate the value of the adversarial loss function of the first discriminator according to the discrimination result of the first discriminator for the predicted segmentation result of the target domain image.
The adversarial loss function of the first discriminator is used to measure the degree of difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image. It can be written as

$$L_{Adv1}(X_T) = -\sum_{h,w} L_{MAE}\big(D(P_T)^{(h,w,1)},\, z\big)$$

where X_T denotes the target domain image, L_MAE denotes the mean absolute error function, and z = 0 indicates that the segmentation result input to the discriminator is to be taken as coming from the source domain.
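A code sketch of these two losses follows, continuing the PyTorch assumption and the two-channel discriminator sketched earlier. The per-pixel sums are replaced by means, and the label and sign conventions follow the training directions described in the text; this is an interpretation, not the patent's reference implementation:

```python
import torch
import torch.nn.functional as F

def d1_discriminant_loss(d1, p_s, p_t):
    """L_D1 sketch: cross-entropy pushing D1 to label source predictions
    z = 0 and target predictions z = 1, per pixel of its output map."""
    out_s, out_t = d1(p_s), d1(p_t)       # (N, 2, h, w) logits
    z0 = torch.zeros(out_s.shape[0], *out_s.shape[2:],
                     dtype=torch.long, device=out_s.device)
    z1 = torch.ones(out_t.shape[0], *out_t.shape[2:],
                    dtype=torch.long, device=out_t.device)
    return F.cross_entropy(out_s, z0) + F.cross_entropy(out_t, z1)

def d1_adversarial_loss(d1, p_t):
    """L_Adv1 sketch: mean absolute error driving D1's 'target' score on
    target-domain predictions toward the source label z = 0, i.e. the
    segmentation model is rewarded when its target-domain predictions
    look source-like to D1."""
    score = torch.softmax(d1(p_t), dim=1)[:, 1]   # channel 1 = 'target' class
    return F.l1_loss(score, torch.zeros_like(score))
```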
Step 307: input the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image to the second discriminator, respectively, to obtain the discrimination result of the second discriminator.
The second discriminator is used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result; it likewise performs a binary classification task. For example, the result of the second discriminator can be a number between 0 and 1: a result of 0 indicates that the input is the predicted segmentation result of the source domain image, and a result of 1 indicates that it is the standard segmentation result of the source domain image.
Step 308: calculate the value of the discriminant loss function of the second discriminator according to the discrimination result of the second discriminator.
The discriminant loss function of the second discriminator is used to measure the discrimination accuracy of the second discriminator. It can be written as

$$L_{D2}(P_S) = -\sum_{h,w}\Big[(1-u)\,\log\big(D(P_S)^{(h,w,0)}\big) + u\,\log\big(D(P_S)^{(h,w,1)}\big)\Big]$$

where P_S denotes the predicted segmentation result of the source domain image and u is a constant: u = 1 indicates that the input is a standard segmentation result, and u = 0 indicates that it is a predicted segmentation result.
The predicted segmentation result P_S of the source domain image can be written as

$$P_S = G_{Seg}(X_S)$$

where G_Seg denotes the image segmentation model and X_S denotes the source domain image.
Step 309: adjust the parameters of the second discriminator by minimizing the value of the discriminant loss function of the second discriminator.
Since the value of the discriminant loss function reflects the discrimination accuracy of the second discriminator and is inversely proportional to it, during training the parameters of the second discriminator can be adjusted by minimizing this value, so that the second discriminator discriminates as accurately as possible whether an input segmentation result is the predicted segmentation result or the standard segmentation result of the source domain image.
Step 310: calculate the value of the adversarial loss function of the second discriminator according to the discrimination result of the second discriminator for the predicted segmentation result of the source domain image.
The adversarial loss function of the second discriminator is used to measure the degree of difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image. It can be written as

$$L_{Adv2}(X_S) = -\sum_{h,w} L_{MAE}\big(D(P_S)^{(h,w,1)},\, u\big)$$

where X_S denotes the source domain image, L_MAE denotes the mean absolute error function, and u = 1 indicates that the segmentation result input to the discriminator is the standard segmentation result of the source domain image.
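Mirroring the first discriminator, a sketch of L_D2 and L_Adv2 under the same assumptions; the standard segmentation result is assumed to be fed to D2 as a one-hot map so that both inputs have the same shape as the predicted probability maps:

```python
import torch
import torch.nn.functional as F

def d2_discriminant_loss(d2, p_s, y_s_onehot):
    """L_D2 sketch: cross-entropy pushing D2 to label predicted source maps
    u = 0 and standard (ground-truth) maps u = 1."""
    out_pred, out_std = d2(p_s), d2(y_s_onehot)
    u0 = torch.zeros(out_pred.shape[0], *out_pred.shape[2:],
                     dtype=torch.long, device=out_pred.device)
    u1 = torch.ones_like(u0)
    return F.cross_entropy(out_pred, u0) + F.cross_entropy(out_std, u1)

def d2_adversarial_loss(d2, p_s):
    """L_Adv2 sketch: MAE driving D2's 'standard' score on predicted source
    maps toward u = 1, i.e. predictions should look like ground truth."""
    score = torch.softmax(d2(p_s), dim=1)[:, 1]   # channel 1 = 'standard' class
    return F.l1_loss(score, torch.ones_like(score))
```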
Step 311: construct an objective function from the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator.
The loss function L_Seg(X_S) of the pre-trained image segmentation model can use the cross-entropy (CE) loss:

$$L_{Seg}(X_S) = -\sum_{h,w}\sum_{c\in C} Y_S^{(h,w,c)}\,\log\big(P_S^{(h,w,c)}\big)$$

where X_S denotes the source domain image and Y_S denotes the standard segmentation result of the source domain image.
The objective function for training the image segmentation model can be written as

$$\min_{G_{Seg}}\;\max_{D_1,\,D_2}\;\; \alpha_{Seg}\,L_{Seg}(X_S) + \alpha_{Adv1}\,L_{Adv1}(X_T) + \alpha_{Adv2}\,L_{Adv2}(X_S)$$

where α_Seg, α_Adv1, and α_Adv2 are tuning parameters used to balance the loss function of the image segmentation model and the adversarial loss functions of the first and second discriminators during training. The maximization over D_1 and D_2 denotes maximizing the values of the discriminant loss functions of the first and second discriminators, and the minimization over G_Seg denotes minimizing the value of the weighted sum of the loss function of the image segmentation model and the adversarial loss functions of the first and second discriminators.
Step 312: adjust the parameters of the pre-trained image segmentation model by minimizing the value of the weighted sum of the loss function of the image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, while maximizing the values of the discriminant loss functions of the first and second discriminators, to obtain the trained image segmentation model.
After the computer device obtains the value of the loss function of the image segmentation model and feeds the values of the adversarial loss functions of the first and second discriminators back to the segmentation network, the segmentation network adjusts its parameters to minimize the weighted sum of the loss function of the image segmentation model and the two adversarial loss functions while maximizing the values of the discriminant loss functions of the two discriminators. Through this adversarial training of the segmentation network and the discrimination networks, the segmentation results that the image segmentation model predicts for the source domain image and the target domain image are brought as close as possible to the standard segmentation result of the source domain image.
Because minimizing the adversarial loss function of the first discriminator gradually pulls the segmentation result of the source domain image toward that of the target domain image, the segmentation result of the source domain image gradually drifts away from its standard segmentation result, i.e., the segmentation accuracy of the model on source domain images decreases. Minimizing the adversarial loss function of the second discriminator then pulls the segmentation result of the source domain image back toward its standard segmentation result, which further brings the predicted segmentation results of both the source domain image and the target domain image as close as possible to the standard segmentation result of the source domain image.
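Putting the pieces together, one possible shape of a single training iteration is sketched below, reusing the loss helpers sketched above, with illustrative loss weights and a torchvision-style model whose forward returns a dict with key "out"; all of these choices are assumptions, not values fixed by the patent:

```python
import torch
import torch.nn.functional as F

def to_one_hot(y, num_classes):
    # (N, H, W) integer labels -> (N, C, H, W) float map for the discriminator
    return F.one_hot(y, num_classes).permute(0, 3, 1, 2).float()

def train_step(G_seg, d1, d2, opt_g, opt_d1, opt_d2, x_s, y_s, x_t,
               num_classes, a_seg=1.0, a_adv1=0.001, a_adv2=0.001):
    """One iteration of the loop over steps 302 to 312: a generator update
    (minimizing the weighted sum), then discriminator updates."""
    # generator update: min  a_seg*L_Seg + a_adv1*L_Adv1 + a_adv2*L_Adv2
    logits_s = G_seg(x_s)["out"]
    p_s = torch.softmax(logits_s, dim=1)
    p_t = torch.softmax(G_seg(x_t)["out"], dim=1)
    loss_g = (a_seg * F.cross_entropy(logits_s, y_s)     # L_Seg
              + a_adv1 * d1_adversarial_loss(d1, p_t)    # L_Adv1
              + a_adv2 * d2_adversarial_loss(d2, p_s))   # L_Adv2
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # discriminator updates (detached so no gradient reaches the generator)
    p_s, p_t = p_s.detach(), p_t.detach()
    loss_d1 = d1_discriminant_loss(d1, p_s, p_t)
    opt_d1.zero_grad(); loss_d1.backward(); opt_d1.step()
    loss_d2 = d2_discriminant_loss(d2, p_s, to_one_hot(y_s, num_classes))
    opt_d2.zero_grad(); loss_d2.backward(); opt_d2.step()
    return loss_g.item(), loss_d1.item(), loss_d2.item()
```

The alternating generator and discriminator updates realize the min-max objective above; whether a discriminator step is described as minimizing its discriminant loss or as the "max" side of the objective is only a sign convention.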
The computer device stops training the model when the image segmentation model meets the training stop condition, obtaining the trained image segmentation model, whose segmentation results for target domain images are more similar to the standard segmentation results. The training stop condition of the image segmentation model can be set in advance, for example, the value of the loss function reaching a preset threshold, the number of training rounds reaching a preset number, or the training duration reaching a preset duration, which is not limited in the embodiments of this application.
Optionally, before inputting the source domain image and the target domain image into the image segmentation model, the computer device normalizes the source domain image and the target domain image to obtain a processed source domain image and a processed target domain image, for example by normalizing the pixel value of each pixel in the source domain image and the target domain image to [-1, 1]; the processed source domain image and the processed target domain image are used for training the image segmentation model.
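The text fixes only the target range, not the normalization formula; a common min-max mapping into [-1, 1], as one plausible reading, looks like this:

```python
import numpy as np

def normalize(img: np.ndarray) -> np.ndarray:
    """Min-max normalize pixel values into [-1, 1] (assumed formula;
    the patent specifies only the output range)."""
    img = img.astype(np.float32)
    lo, hi = img.min(), img.max()
    return 2.0 * (img - lo) / (hi - lo + 1e-8) - 1.0
```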
It should be noted that in the embodiments of this application the first discriminator and the second discriminator share parameters. In one example, the parameters of the first and second discriminators are shared in real time: in each training round, after the parameters of the first discriminator are updated, the updated parameters are synchronized to the second discriminator; the second discriminator trains with the synchronized parameters, updates them again, and synchronizes the re-updated parameters back to the first discriminator. Sharing parameters between the two discriminators in real time during training helps improve the training efficiency of the model.
In some other examples, the first and second discriminators share parameters only at the beginning of training and then update their parameters independently. In this case, the first discriminator can be trained first and then the second discriminator, or the second discriminator first and then the first, or both can be trained simultaneously, which is not limited in the embodiments of this application.
In addition, the initial learning rates of the segmentation network, the first discriminator, and the second discriminator are preset values, for example 1.5×10⁻⁵, 1×10⁻⁵, and 1×10⁻⁵, respectively.
In summary, in the technical solution provided by this embodiment, the segmentation results of the source domain image and the target domain image are input to the first discriminator, and the segmentation result of the source domain image and the standard segmentation result of the source domain image are input to the second discriminator, yielding the discriminant and adversarial loss functions of the first discriminator and of the second discriminator. The loss function of the pre-trained image segmentation model and the adversarial loss functions of the two discriminators are then fed back to the image segmentation model, and the parameters of the pre-trained image segmentation model are adjusted by minimizing the value of the weighted sum of the loss function of the image segmentation model and the adversarial loss functions of the two discriminators while maximizing the values of their discriminant loss functions, obtaining the trained image segmentation model. The trained image segmentation model can accurately segment images from the target domain and has good segmentation performance and generalization ability.
In addition, in this embodiment, the source domain image and the target domain image are normalized before being input into the image segmentation model, so that the input images and the discrimination results of the discriminators lie in the same dimension, further improving the training and optimization of the image segmentation model.
With reference to FIG. 4, which exemplarily shows schematic diagrams of segmentation results under different segmentation methods: (a) shows the source domain image; (b) the target domain image; (c) the segmentation result, on the source domain image, of an image segmentation model trained using only source domain images; (d) the segmentation result, on the target domain image, of an image segmentation model trained with the source domain images and their standard segmentation results but without domain adaptation training; (e) the segmentation result, on the target domain image, of the trained image segmentation model obtained with the training method provided by this solution; and (f) the standard segmentation result of the target domain image. As can be seen from FIG. 4, the image segmentation model trained with the training method provided by this solution segments the target region accurately and has good segmentation performance.
After the trained image segmentation model is obtained, it can be deployed in a computer device; when the computer device acquires an image to be segmented from the target domain, it calls the trained image segmentation model to accurately segment the target region in the image and obtain the segmentation result of the image to be segmented. Taking a computer device serving as an auxiliary diagnosis platform in a hospital as an example, the trained image segmentation model is deployed on the platform, which can then directly segment accurate distribution information of the lesion region when a patient's medical image is collected, helping the doctor make an accurate diagnosis.
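At deployment time the call reduces to a single forward pass. A minimal sketch, again assuming the torchvision-style model convention used in the earlier sketches:

```python
import torch

@torch.no_grad()
def segment(G_seg, image: torch.Tensor) -> torch.Tensor:
    """Run the trained model on a target-domain image batch and return the
    per-pixel class labels (argmax over the class dimension)."""
    G_seg.eval()
    logits = G_seg(image)["out"]   # (N, C, H, W)
    return logits.argmax(dim=1)    # (N, H, W) label map
```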
Below, the beneficial effects of this solution are further described by testing it on three different image datasets.
The three datasets are the BRATS (Brain Tumor Segmentation) 2018 dataset, a private glioma dataset, and the multi-center SCGM (Spinal Cord Gray Matter segmentation) 2017 dataset.
The BRATS 2018 dataset includes 285 labeled samples, each with 4 modalities: FLAIR (FLuid Attenuated Inversion Recovery), contrast-enhanced T1, T1 MRI, and T2 MRI. Preprocessing of the data includes skull stripping, registration, and resampling to a resolution of 1×1×1 mm³; the dimensions of each sample are 240×240×155. In the tests, only the T2 MRI data of this dataset are used, and the 3D T2 MRI axial views are converted into multi-slice 2D images.
The private glioma dataset includes 200 labeled samples; each sample only has a thick-slice 3D T2 MRI dataset, and the label set only annotates the tumor edema region (i.e., the whole tumor region). Because this dataset was scanned with thick slices, only the axial view is structurally clear, and the images of the other two views (coronal and sagittal) are very blurry. Therefore, only the axial plane is used in the tests; the 3D T2 MRI axial views are converted into multi-slice 2D images, and the 2D images are resampled to a size of 513×513. Preprocessing of this data consists of skull stripping only.
The SCGM 2017 dataset includes data from 4 different centers, with 40 labeled samples in total. The data dimensions range from 0.25×0.25×0.25 mm³ to 0.5×0.5×0.5 mm³, and the 3D T2 MRI axial views are converted into multi-slice 2D images.
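The slice-and-resample preprocessing shared by these datasets might look as follows; this is a NumPy/SciPy sketch in which the axis convention and the 513×513 size are taken from the private-dataset description above and are otherwise illustrative:

```python
import numpy as np
from scipy.ndimage import zoom

def volume_to_axial_slices(vol: np.ndarray, out_hw=(513, 513)) -> np.ndarray:
    """Split a 3D T2 MRI volume (H, W, S) along the axial axis into 2D slices
    and resample each slice to a fixed size."""
    slices = []
    for k in range(vol.shape[2]):
        sl = vol[:, :, k].astype(np.float32)
        fy = out_hw[0] / sl.shape[0]
        fx = out_hw[1] / sl.shape[1]
        slices.append(zoom(sl, (fy, fx), order=1))  # bilinear resampling
    return np.stack(slices, axis=0)                 # (S, 513, 513)
```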
In the brain tumor segmentation task, two test schemes were designed: Test 1, with the BRATS 2018 data as the source domain data and the private glioma data as the target domain data; and Test 2, with the private glioma data as the source domain data and the BRATS 2018 data as the target domain data. In addition to using DeepLabv3+ as the segmentation model, DeepLabv2 is also used as the segmentation model for comparison. Furthermore, the adversarial domain adaptation of this solution, performed in the output space, is compared with the ADDA (Adversarial Discriminative Domain Adaptation, unsupervised domain adaptation) segmentation algorithm performed in the feature space.
Table-1 (rendered as an image in the original publication) shows the test results of Test 1 and Test 2 in the brain tumor segmentation task.
In Table-1, the first row lists the measurement indexes of image segmentation: the Dice coefficient (Dice Score) is used to measure the similarity of two sets; Sensitivity is the proportion of accurately segmented results among all test results; Specificity is the proportion of correctly identified negative (non-target) results; and the Hausdorff Distance is a distance defined between any two sets in a metric space. The Dice coefficient, sensitivity, and specificity are directly proportional to the accuracy of the image segmentation model, while the Hausdorff distance is inversely proportional to it. In the second row, P indicates that the private glioma data is used as the source domain data and the BRATS 2018 data as the target domain data, and B indicates that the BRATS 2018 data is used as the source domain data and the private glioma data as the target domain data. The third to fifth rows show the test results obtained with the DeepLabv3+ and DeepLabv2 segmentation models and with the ADDA segmentation algorithm, respectively. The sixth row, ¹Ours, shows the test results obtained by this solution using DeepLabv2 as the segmentation model, and the seventh row, ²Ours, the test results obtained using DeepLabv3+ as the segmentation model. As can be seen from Table-1, the test results of this solution are more accurate than those of the three related segmentation models and algorithms above. Moreover, comparing the test results of the sixth and seventh rows shows that using DeepLabv3+ as the segmentation model in this solution yields better segmentation results.
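These four metrics are standard; a NumPy/SciPy sketch of how they can be computed on binary masks follows (the patent does not provide an evaluation script, so details such as the epsilon guards are illustrative):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_score(pred, gt):
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + 1e-8)

def sensitivity(pred, gt):
    """True-positive rate: fraction of target pixels that were recovered."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return np.logical_and(pred, gt).sum() / (gt.sum() + 1e-8)

def specificity(pred, gt):
    """True-negative rate: fraction of background pixels kept as background."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return np.logical_and(~pred, ~gt).sum() / ((~gt).sum() + 1e-8)

def hausdorff_distance(pred, gt):
    """Symmetric Hausdorff distance between the two foreground point sets."""
    p, g = np.argwhere(pred), np.argwhere(gt)   # foreground pixel coordinates
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])
```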
With reference to FIG. 5, which shows example diagrams of brain tumor segmentation results under different segmentation methods. Row P shows the test results obtained with the private glioma data as the source domain data and the BRATS 2018 data as the target domain data; row B shows the test results obtained with the BRATS 2018 data as the source domain data and the private glioma data as the target domain data. The first column, Axial, shows the axial view of the data; the second column shows the GT (Ground Truth) segmentation reference; the third column shows the test results obtained using DeepLabv3+ as the segmentation model with DA (Domain Adaptation), i.e., the test results of this solution; the fourth column shows the test results obtained using only DeepLabv3+ without DA; the fifth column shows the test results obtained using DeepLabv2 with DA; and the sixth column shows the test results obtained using only DeepLabv2 without DA. It can be seen intuitively from FIG. 5 that the technical solution of this application segments the target domain image accurately.
In the spinal cord gray matter segmentation task, two test schemes were designed with reference to related technical solutions: Test 1, with the data of center 1 and center 2 as the source domain data and the data of center 3 as the target domain data; and Test 2, with the data of centers 1 and 2 as the source domain data and the data of center 4 as the target domain data. The results are also compared with those of two related technical solutions, namely the EMA (Exponential Moving Average) segmentation model and the UDASE (Unsupervised Domain Adaptation with Self-Ensembling) segmentation model. The two test schemes designed in this application are the same as those provided in the related technologies, so that the effect of the technical solution provided in this application can be compared with that of the related technologies.
Table-2 (rendered as an image in the original publication) shows the test results of Test 1 and Test 2 in the spinal cord gray matter segmentation task.
In Table-2, the DeepLabv3+ row shows the test results obtained using DeepLabv3+ as the segmentation model; the EMA and UDASE rows show the test results of the related technologies; the Ours row shows the test results of this solution. As can be seen from Table-2, for Test 2, i.e., the test in which the source domain data of centers 1 and 2 are adapted to center 4, the segmentation performance of this solution's segmentation model is significantly better than the solutions provided by the related technologies. For Test 1, in which the source domain data of centers 1 and 2 are adapted to center 3, compared with the test results obtained using only DeepLabv3+ for segmentation without adaptation, this solution significantly improves the segmentation performance on the target domain data.
With reference to FIG. 6, which shows example diagrams of spinal cord gray matter segmentation results under different segmentation methods: the first and fourth columns show the segmentation results obtained with the related technologies in Test 1 and Test 2, respectively; the second and fifth columns show the test results obtained in Test 1 and Test 2 using DeepLabv3+ as the segmentation model with DA, i.e., the test results of this solution; and the third and sixth columns show the test results obtained in Test 1 and Test 2 using only DeepLabv3+ without DA. It can be seen even more intuitively from FIG. 6 that the technical solution of this application segments the target domain image accurately, and that the visual information of the target domain image output in the output space is clearer and more accurate.
In summary, the technical solution provided by this application uses DeepLabv3+ as the segmentation model and performs domain adaptation in the output space, improving the segmentation performance and generalization ability of the finally trained image segmentation model, so that it can segment the target domain image accurately and the segmentation result of the target domain image is more accurate.
It should be understood that although the steps in the flowcharts of FIGS. 1 and 3 are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they can be executed in other orders. Moreover, at least some of the steps in FIGS. 1 and 3 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; the execution order of these sub-steps or stages is not necessarily sequential either, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
The following are apparatus embodiments of this application, which can be used to perform the method embodiments of this application. For details not disclosed in the apparatus embodiments, please refer to the method embodiments of this application.
Please refer to FIG. 7, which shows a block diagram of a training apparatus for an image segmentation model provided by an embodiment of this application. The apparatus has the function of implementing the above training method example for the image segmentation model; the function can be implemented by hardware or by hardware executing corresponding software. The apparatus can be the computer device described above or can be provided on a computer device. The apparatus 700 may include: a first training module 710, a result extraction module 720, a second training module 730, a third training module 740, and a fourth training module 750.
The first training module 710 is configured to train an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model, the source domain samples including source domain images and the standard segmentation results of the source domain images.
The result extraction module 720 is configured to extract, through the pre-trained image segmentation model, the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image.
The second training module 730 is configured to train the first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image, the first discriminator being used to discriminate whether an input segmentation result comes from the source domain or the target domain.
The third training module 740 is configured to train the second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image, the second discriminator being used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result.
The fourth training module 750 is configured to retrain the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain the trained image segmentation model.
In summary, in the technical solution provided by the embodiments of this application, the predicted segmentation results of the source domain image and the target domain image are extracted through an image segmentation model pre-trained on source domain samples; the predicted segmentation results are input to the first discriminator, the predicted and standard segmentation results of the source domain image are input to the second discriminator, and the pre-trained model is retrained with the idea of adversarial learning, iterating until convergence to obtain the trained image segmentation model. Aligning the source domain image and the target domain image in the output space allows the trained model to reduce the difference between them in the output space, reducing its error on the target domain and making the segmentation result of the target domain image more accurate.
In some possible designs, the second training module 730 is configured to input the predicted segmentation results of the source domain image and the target domain image to the first discriminator, respectively, to obtain the discrimination result of the first discriminator; to calculate, according to that discrimination result, the value of the discriminant loss function of the first discriminator, which measures its discrimination accuracy; and to adjust the parameters of the first discriminator by minimizing that value.
In some possible designs, with reference to FIG. 8, the apparatus 700 further includes a first calculation module 760.
The first calculation module 760 is configured to calculate the value of the adversarial loss function of the first discriminator according to the discrimination result of the first discriminator for the predicted segmentation result of the target domain image; the adversarial loss function of the first discriminator measures the degree of difference between the predicted segmentation results of the target domain image and the source domain image.
In some possible designs, the third training module 740 is configured to input the predicted and standard segmentation results of the source domain image to the second discriminator, respectively, to obtain the discrimination result of the second discriminator; to calculate, according to that discrimination result, the value of the discriminant loss function of the second discriminator, which measures its discrimination accuracy; and to adjust the parameters of the second discriminator by minimizing that value.
In some possible designs, with reference to FIG. 8, the apparatus 700 further includes a second calculation module 770.
The second calculation module 770 is configured to calculate the value of the adversarial loss function of the second discriminator according to the discrimination result of the second discriminator for the predicted segmentation result of the source domain image; the adversarial loss function of the second discriminator measures the degree of difference between the predicted and standard segmentation results of the source domain image.
In some possible designs, the fourth training module 750 is configured to construct an objective function from the loss function of the pre-trained image segmentation model and the adversarial loss functions of the first and second discriminators, and to adjust the parameters of the pre-trained model by minimizing the value of the weighted sum of the loss function of the image segmentation model and the two adversarial loss functions while maximizing the values of the discriminant loss functions of the two discriminators, obtaining the trained image segmentation model.
In some possible designs, the first discriminator and the second discriminator share parameters.
In some possible designs, with reference to FIG. 8, the apparatus 700 further includes an image processing module 780.
The image processing module 780 is configured to normalize the source domain image and the target domain image to obtain a processed source domain image and a processed target domain image, which are used for training the image segmentation model.
In addition, an embodiment of this application provides an image segmentation apparatus. The apparatus has the function of implementing the above image segmentation method example; the function can be implemented by hardware or by hardware executing corresponding software. The apparatus can be the computer device described above or can be provided on a computer device. The apparatus may include an acquisition module and a calling module.
The acquisition module is used to acquire an image to be segmented from the target domain.
The calling module is used to call the trained image segmentation model to process the image to be segmented to obtain its segmentation result; the trained image segmentation model is obtained by training an image segmentation model with adversarial learning in the output space through a first discriminator and a second discriminator.
The first discriminator is used to reduce, during training of the image segmentation model, the difference between the predicted segmentation result of the target domain image and that of the source domain image; the second discriminator is used to reduce, during training of the image segmentation model, the difference between the predicted and standard segmentation results of the source domain image.
It should be noted that, when the apparatus provided in the above embodiments implements its functions, the division into the above functional modules is only used as an example; in practical applications, the above functions can be allocated to different functional modules as needed, i.e., the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments provided above belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
Please refer to FIG. 9, which shows a schematic structural diagram of a computer device provided by an embodiment of this application. The computer device can be any electronic device with data processing and storage functions, such as a PC or a server, and is used to implement the training method for the image segmentation model provided in the above embodiments. Specifically:
The computer device 900 includes a central processing unit (CPU) 901, a system memory 904 including a random access memory (RAM) 902 and a read-only memory (ROM) 903, and a system bus 905 connecting the system memory 904 and the central processing unit 901. The computer device 900 also includes a basic input/output system (I/O system) 906 that helps transfer information between the components in the computer, and a mass storage device 907 for storing an operating system 913, application programs 914, and other program modules 912.
The basic input/output system 906 includes a display 908 for displaying information and an input device 909, such as a mouse or a keyboard, for the user to input information. The display 908 and the input device 909 are both connected to the central processing unit 901 through an input/output controller 910 connected to the system bus 905. The basic input/output system 906 can also include the input/output controller 910 for receiving and processing input from a keyboard, a mouse, an electronic stylus, or multiple other devices. Similarly, the input/output controller 910 also provides output to a display screen, a printer, or other types of output devices.
The mass storage device 907 is connected to the central processing unit 901 through a mass storage controller (not shown) connected to the system bus 905. The mass storage device 907 and its associated computer-readable medium provide non-volatile storage for the computer device 900; that is, the mass storage device 907 can include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, the computer-readable medium can include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), and flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM). Of course, those skilled in the art will know that the computer storage medium is not limited to the above. The system memory 904 and the mass storage device 907 above can be collectively referred to as the memory.
According to various embodiments of this application, the computer device 900 can also run by being connected, through a network such as the Internet, to a remote computer on the network; that is, the computer device 900 can be connected to the network 912 through the network interface unit 911 connected to the system bus 905, or the network interface unit 911 can be used to connect to other types of networks or remote computer systems (not shown).
The memory also includes at least one instruction, at least one program, a code set, or an instruction set, which is stored in the memory and configured to be executed by one or more processors to implement the above training method for the image segmentation model or the image segmentation method.
In an exemplary embodiment, a computer device is also provided. The computer device can be a terminal or a computer device. The computer device includes a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above training method for the image segmentation model or the image segmentation method.
In an exemplary embodiment, a computer-readable storage medium is also provided; the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which, when executed by a processor, implements the above training method for the image segmentation model or the image segmentation method.
In an exemplary embodiment, a computer program product is also provided; when executed by a processor, it is used to implement the training method for the image segmentation model or the image segmentation method provided in the above embodiments.
It should be understood that "multiple" mentioned herein means two or more. "And/or" describes the association relationship of associated objects and indicates that three relationships can exist; for example, A and/or B can mean: A alone, both A and B, and B alone. The character "/" generally indicates an "or" relationship between the associated objects before and after it.
The technical features of the above embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features of the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification. The above embodiments express only several implementations of this application; their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent application. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this application patent shall be subject to the appended claims.

Claims (16)

  1. A training method for an image segmentation model, executed by a computer device, wherein the method comprises:
    training an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model, the source domain samples comprising a source domain image and a standard segmentation result of the source domain image;
    extracting, through the pre-trained image segmentation model, a predicted segmentation result of the source domain image and a predicted segmentation result of a target domain image;
    training a first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image, wherein the first discriminator is used to discriminate whether an input segmentation result comes from the source domain or the target domain;
    training a second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image, wherein the second discriminator is used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result; and
    retraining the pre-trained image segmentation model according to a loss function of the pre-trained image segmentation model, an adversarial loss function of the first discriminator, and an adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain a trained image segmentation model.
  2. The method according to claim 1, wherein training the first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image comprises:
    inputting the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image to the first discriminator, respectively, to obtain a discrimination result of the first discriminator;
    calculating, according to the discrimination result of the first discriminator, a value of a discriminant loss function of the first discriminator, wherein the discriminant loss function of the first discriminator is used to measure the discrimination accuracy of the first discriminator; and
    adjusting parameters of the first discriminator by minimizing the value of the discriminant loss function of the first discriminator.
  3. The method according to claim 2, wherein after inputting the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image to the first discriminator, respectively, to obtain the discrimination result of the first discriminator, the method further comprises:
    calculating a value of the adversarial loss function of the first discriminator according to the discrimination result of the first discriminator for the predicted segmentation result of the target domain image,
    wherein the adversarial loss function of the first discriminator is used to measure the degree of difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image.
  4. The method according to claim 1, wherein training the second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image comprises:
    inputting the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image to the second discriminator, respectively, to obtain a discrimination result of the second discriminator;
    calculating, according to the discrimination result of the second discriminator, a value of a discriminant loss function of the second discriminator, wherein the discriminant loss function of the second discriminator is used to measure the discrimination accuracy of the second discriminator; and
    adjusting parameters of the second discriminator by minimizing the value of the discriminant loss function of the second discriminator.
  5. The method according to claim 4, wherein after inputting the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image to the second discriminator, respectively, to obtain the discrimination result of the second discriminator, the method further comprises:
    calculating a value of the adversarial loss function of the second discriminator according to the discrimination result of the second discriminator for the predicted segmentation result of the source domain image,
    wherein the adversarial loss function of the second discriminator is used to measure the degree of difference between the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image.
  6. The method according to any one of claims 1 to 5, wherein retraining the pre-trained image segmentation model according to the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain the trained image segmentation model, comprises:
    constructing an objective function from the loss function of the pre-trained image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator; and
    adjusting parameters of the pre-trained image segmentation model by minimizing the value of the weighted sum of the loss function of the image segmentation model, the adversarial loss function of the first discriminator, and the adversarial loss function of the second discriminator, while maximizing the value of the discriminant loss function of the first discriminator and the value of the discriminant loss function of the second discriminator, to obtain the trained image segmentation model.
  7. The method according to any one of claims 1 to 5, wherein the first discriminator and the second discriminator share parameters.
  8. The method according to any one of claims 1 to 5, wherein the method further comprises:
    normalizing the source domain image and the target domain image to obtain a processed source domain image and a processed target domain image,
    wherein the processed source domain image and the processed target domain image are used for training the image segmentation model.
  9. An image segmentation method, executed by a computer device, wherein the method comprises:
    acquiring an image to be segmented from a target domain; and
    calling a trained image segmentation model to process the image to be segmented to obtain a segmentation result of the image to be segmented, the trained image segmentation model being obtained by training an image segmentation model with adversarial learning in the output space through a first discriminator and a second discriminator,
    wherein the first discriminator is used to reduce, during training of the image segmentation model, the difference between a predicted segmentation result of a target domain image and a predicted segmentation result of a source domain image, and the second discriminator is used to reduce, during training of the image segmentation model, the difference between the predicted segmentation result of the source domain image and a standard segmentation result of the source domain image.
  10. A training apparatus for an image segmentation model, wherein the apparatus comprises:
    a first training module, configured to train an initial image segmentation model with source domain samples to obtain a pre-trained image segmentation model, the source domain samples comprising a source domain image and a standard segmentation result of the source domain image;
    a result extraction module, configured to extract, through the pre-trained image segmentation model, a predicted segmentation result of the source domain image and a predicted segmentation result of a target domain image;
    a second training module, configured to train a first discriminator with the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image, wherein the first discriminator is used to discriminate whether an input segmentation result comes from the source domain or the target domain;
    a third training module, configured to train a second discriminator with the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image, wherein the second discriminator is used to discriminate whether an input segmentation result is a predicted segmentation result or a standard segmentation result; and
    a fourth training module, configured to retrain the pre-trained image segmentation model according to a loss function of the pre-trained image segmentation model, an adversarial loss function of the first discriminator, and an adversarial loss function of the second discriminator, iterating this training loop until convergence to obtain a trained image segmentation model.
  11. The apparatus according to claim 10, wherein the second training module is configured to:
    input the predicted segmentation result of the source domain image and the predicted segmentation result of the target domain image to the first discriminator, respectively, to obtain a discrimination result of the first discriminator;
    calculate, according to the discrimination result of the first discriminator, a value of a discriminant loss function of the first discriminator, wherein the discriminant loss function of the first discriminator is used to measure the discrimination accuracy of the first discriminator; and
    adjust parameters of the first discriminator by minimizing the value of the discriminant loss function of the first discriminator.
  12. The apparatus according to claim 11, wherein the apparatus further comprises:
    a first calculation module, configured to calculate a value of the adversarial loss function of the first discriminator according to the discrimination result of the first discriminator for the predicted segmentation result of the target domain image,
    wherein the adversarial loss function of the first discriminator is used to measure the degree of difference between the predicted segmentation result of the target domain image and the predicted segmentation result of the source domain image.
  13. The apparatus according to claim 10, wherein the third training module is configured to:
    input the predicted segmentation result of the source domain image and the standard segmentation result of the source domain image to the second discriminator, respectively, to obtain a discrimination result of the second discriminator;
    calculate, according to the discrimination result of the second discriminator, a value of a discriminant loss function of the second discriminator, wherein the discriminant loss function of the second discriminator is used to measure the discrimination accuracy of the second discriminator; and
    adjust parameters of the second discriminator by minimizing the value of the discriminant loss function of the second discriminator.
  14. An image segmentation apparatus, wherein the apparatus comprises:
    an acquisition module, configured to acquire an image to be segmented from a target domain; and
    a calling module, configured to call a trained image segmentation model to process the image to be segmented to obtain a segmentation result of the image to be segmented, the trained image segmentation model being obtained by training an image segmentation model with adversarial learning in the output space through a first discriminator and a second discriminator,
    wherein the first discriminator is used to reduce, during training of the image segmentation model, the difference between a predicted segmentation result of a target domain image and a predicted segmentation result of a source domain image, and the second discriminator is used to reduce, during training of the image segmentation model, the difference between the predicted segmentation result of the source domain image and a standard segmentation result of the source domain image.
  15. A computer device, wherein the computer device comprises a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the method according to any one of claims 1 to 8 or the method according to claim 9.
  16. A computer-readable storage medium, wherein the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the method according to any one of claims 1 to 8 or the method according to claim 9.
PCT/CN2020/091455 2019-05-27 2020-05-21 图像分割模型的训练方法、装置、计算机设备和存储介质 WO2020238734A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20815327.0A EP3979198A4 (en) 2019-05-27 2020-05-21 IMAGE SEGMENTATION MODEL TRAINING METHOD AND INSTALLATION, COMPUTER DEVICE AND STORAGE MEDIA
US17/470,433 US11961233B2 (en) 2019-05-27 2021-09-09 Method and apparatus for training image segmentation model, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910448095.6A CN110148142B (zh) 2019-05-27 2019-05-27 图像分割模型的训练方法、装置、设备和存储介质
CN201910448095.6 2019-05-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/470,433 Continuation US11961233B2 (en) 2019-05-27 2021-09-09 Method and apparatus for training image segmentation model, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2020238734A1 true WO2020238734A1 (zh) 2020-12-03

Family

ID=67593307

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/091455 WO2020238734A1 (zh) 2019-05-27 2020-05-21 图像分割模型的训练方法、装置、计算机设备和存储介质

Country Status (4)

Country Link
US (1) US11961233B2 (zh)
EP (1) EP3979198A4 (zh)
CN (1) CN110148142B (zh)
WO (1) WO2020238734A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990218A (zh) * 2021-03-25 2021-06-18 北京百度网讯科技有限公司 图像语义分割模型的优化方法、装置和电子设备
CN113450351A (zh) * 2021-08-06 2021-09-28 推想医疗科技股份有限公司 分割模型训练方法、图像分割方法、装置、设备及介质
CN113673570A (zh) * 2021-07-21 2021-11-19 南京旭锐软件科技有限公司 电子器件图片分类模型的训练方法、装置及设备
CN113888547A (zh) * 2021-09-27 2022-01-04 太原理工大学 基于gan网络的无监督域自适应遥感道路语义分割方法
CN115050032A (zh) * 2022-05-02 2022-09-13 清华大学 一种基于特征对齐和熵正则化的域适应文本图像识别方法

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110148142B (zh) * 2019-05-27 2023-04-18 腾讯科技(深圳)有限公司 图像分割模型的训练方法、装置、设备和存储介质
CN112733864B (zh) * 2019-09-16 2023-10-31 北京迈格威科技有限公司 模型训练方法、目标检测方法、装置、设备及存储介质
US11455531B2 (en) * 2019-10-15 2022-09-27 Siemens Aktiengesellschaft Trustworthy predictions using deep neural networks based on adversarial calibration
CN110866931B (zh) * 2019-11-18 2022-11-01 东声(苏州)智能科技有限公司 图像分割模型训练方法及基于分类的强化图像分割方法
CN110880182B (zh) * 2019-11-18 2022-08-26 东声(苏州)智能科技有限公司 图像分割模型训练方法、图像分割方法、装置及电子设备
CN110956214B (zh) * 2019-12-03 2023-10-13 北京车和家信息技术有限公司 一种自动驾驶视觉定位模型的训练方法及装置
CN111178207B (zh) * 2019-12-20 2023-08-01 北京邮电大学 一种基于复平面坐标系定位的目标检测方法及装置
CN111209916B (zh) * 2019-12-31 2024-01-23 中国科学技术大学 病灶识别方法及***、识别设备
CN111199256B (zh) * 2020-01-02 2024-03-22 东软医疗***股份有限公司 图像优化网络的训练方法、图像处理方法及装置
TWI726574B (zh) * 2020-01-10 2021-05-01 宏碁股份有限公司 模型訓練方法與電子裝置
CN111340819B (zh) * 2020-02-10 2023-09-12 腾讯科技(深圳)有限公司 图像分割方法、装置和存储介质
CN111402278B (zh) * 2020-02-21 2023-10-27 华为云计算技术有限公司 分割模型训练方法、图像标注方法及相关装置
CN111444951B (zh) * 2020-03-24 2024-02-20 腾讯科技(深圳)有限公司 样本识别模型的生成方法、装置、计算机设备和存储介质
CN111444952B (zh) * 2020-03-24 2024-02-20 腾讯科技(深圳)有限公司 样本识别模型的生成方法、装置、计算机设备和存储介质
CN111199550B (zh) * 2020-04-09 2020-08-11 腾讯科技(深圳)有限公司 图像分割网络的训练方法、分割方法、装置和存储介质
US11600017B2 (en) * 2020-04-29 2023-03-07 Naver Corporation Adversarial scene adaptation for camera pose regression
CN111539947B (zh) * 2020-04-30 2024-03-29 上海商汤智能科技有限公司 图像检测方法及相关模型的训练方法和相关装置、设备
CN111709873B (zh) * 2020-05-27 2023-06-20 北京百度网讯科技有限公司 图像转换模型生成器的训练方法和装置
CN113807529A (zh) * 2020-07-31 2021-12-17 北京沃东天骏信息技术有限公司 机器学习模型的训练方法和装置、图像的分类方法和装置
CN112115976B (zh) * 2020-08-20 2023-12-08 北京嘀嘀无限科技发展有限公司 模型训练方法、模型训练装置、存储介质和电子设备
CN112070163B (zh) * 2020-09-09 2023-11-24 抖音视界有限公司 图像分割模型训练和图像分割方法、装置、设备
CN112330625B (zh) * 2020-11-03 2023-03-24 杭州迪英加科技有限公司 免疫组化核染色切片细胞定位多域共适应训练方法
CN112508974B (zh) * 2020-12-14 2024-06-11 北京达佳互联信息技术有限公司 图像分割模型的训练方法、装置、电子设备和存储介质
CN112633385A (zh) * 2020-12-25 2021-04-09 华为技术有限公司 一种模型训练的方法、数据生成的方法以及装置
CN112686913B (zh) * 2021-01-11 2022-06-10 天津大学 基于边界注意力一致性的目标边界检测和目标分割模型
CN112767463B (zh) * 2021-01-12 2024-02-06 深圳大学 一种对抗配准方法、装置、计算机设备及存储介质
CN112966687B (zh) * 2021-02-01 2024-01-19 深圳市优必选科技股份有限公司 图像分割模型训练方法、装置及通信设备
CN113657389A (zh) * 2021-07-29 2021-11-16 中国科学院软件研究所 一种软件定义卫星语义分割方法、装置和介质
CN113902913A (zh) * 2021-08-31 2022-01-07 际络科技(上海)有限公司 图片语义分割方法及装置
CN113870258B (zh) * 2021-12-01 2022-03-25 浙江大学 一种基于对抗学习的无标签胰腺影像自动分割***
CN115294400B (zh) * 2022-08-23 2023-03-31 北京医准智能科技有限公司 图像分类模型的训练方法、装置、电子设备及存储介质
CN115841475A (zh) * 2022-12-14 2023-03-24 北京医准智能科技有限公司 一种心脏图像分割方法、装置、设备及存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170270671A1 (en) * 2016-03-16 2017-09-21 International Business Machines Corporation Joint segmentation and characteristics estimation in medical images
US20180137623A1 (en) * 2016-11-11 2018-05-17 Microsoft Technology Licensing, Llc Image segmentation using user input speed
CN108389210A (zh) * 2018-02-28 2018-08-10 深圳天琴医疗科技有限公司 一种医学图像分割方法及装置
CN109101975A (zh) * 2018-08-20 2018-12-28 电子科技大学 基于全卷积神经网络的图像语义分割方法
CN109241972A (zh) * 2018-08-20 2019-01-18 电子科技大学 基于深度学习的图像语义分割方法
CN109543502A (zh) * 2018-09-27 2019-03-29 天津大学 一种基于深度多尺度神经网络的语义分割方法
CN110148142A (zh) * 2019-05-27 2019-08-20 腾讯科技(深圳)有限公司 图像分割模型的训练方法、装置、设备和存储介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10970829B2 (en) * 2017-08-24 2021-04-06 Siemens Healthcare Gmbh Synthesizing and segmenting cross-domain medical images
US10643320B2 (en) * 2017-11-15 2020-05-05 Toyota Research Institute, Inc. Adversarial learning of photorealistic post-processing of simulation with privileged information
CN108062753B (zh) * 2017-12-29 2020-04-17 重庆理工大学 基于深度对抗学习的无监督域自适应脑肿瘤语义分割方法
CN109166095B (zh) * 2018-07-11 2021-06-25 广东技术师范学院 一种基于生成对抗机制的眼底影像杯盘分割方法
CN109299716B (zh) * 2018-08-07 2021-07-06 北京市商汤科技开发有限公司 神经网络的训练方法、图像分割方法、装置、设备及介质
CN109166126B (zh) * 2018-08-13 2022-02-18 苏州比格威医疗科技有限公司 一种基于条件生成式对抗网络在icga图像上分割漆裂纹的方法
CN109190707A (zh) * 2018-09-12 2019-01-11 深圳市唯特视科技有限公司 一种基于对抗学习的域自适应图像语义分割方法
CN109558901B (zh) * 2018-11-16 2022-04-05 北京市商汤科技开发有限公司 一种语义分割训练方法及装置、电子设备、存储介质

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170270671A1 (en) * 2016-03-16 2017-09-21 International Business Machines Corporation Joint segmentation and characteristics estimation in medical images
US20180137623A1 (en) * 2016-11-11 2018-05-17 Microsoft Technology Licensing, Llc Image segmentation using user input speed
CN108389210A (zh) * 2018-02-28 2018-08-10 深圳天琴医疗科技有限公司 一种医学图像分割方法及装置
CN109101975A (zh) * 2018-08-20 2018-12-28 电子科技大学 基于全卷积神经网络的图像语义分割方法
CN109241972A (zh) * 2018-08-20 2019-01-18 电子科技大学 基于深度学习的图像语义分割方法
CN109543502A (zh) * 2018-09-27 2019-03-29 天津大学 一种基于深度多尺度神经网络的语义分割方法
CN110148142A (zh) * 2019-05-27 2019-08-20 腾讯科技(深圳)有限公司 图像分割模型的训练方法、装置、设备和存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3979198A4 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990218A (zh) * 2021-03-25 2021-06-18 北京百度网讯科技有限公司 图像语义分割模型的优化方法、装置和电子设备
CN113673570A (zh) * 2021-07-21 2021-11-19 南京旭锐软件科技有限公司 电子器件图片分类模型的训练方法、装置及设备
CN113450351A (zh) * 2021-08-06 2021-09-28 推想医疗科技股份有限公司 分割模型训练方法、图像分割方法、装置、设备及介质
CN113450351B (zh) * 2021-08-06 2024-01-30 推想医疗科技股份有限公司 分割模型训练方法、图像分割方法、装置、设备及介质
CN113888547A (zh) * 2021-09-27 2022-01-04 太原理工大学 基于gan网络的无监督域自适应遥感道路语义分割方法
CN115050032A (zh) * 2022-05-02 2022-09-13 清华大学 一种基于特征对齐和熵正则化的域适应文本图像识别方法

Also Published As

Publication number Publication date
CN110148142A (zh) 2019-08-20
US11961233B2 (en) 2024-04-16
EP3979198A1 (en) 2022-04-06
US20210407086A1 (en) 2021-12-30
CN110148142B (zh) 2023-04-18
EP3979198A4 (en) 2022-07-27

Similar Documents

Publication Publication Date Title
WO2020238734A1 (zh) 图像分割模型的训练方法、装置、计算机设备和存储介质
Mahmood et al. Deep adversarial training for multi-organ nuclei segmentation in histopathology images
WO2021017372A1 (zh) 一种基于生成对抗网络的医学图像分割方法、***及电子设备
US20200167930A1 (en) A System and Computer-Implemented Method for Segmenting an Image
US10580159B2 (en) Coarse orientation detection in image data
US20230046321A1 (en) Medical image analysis using machine learning and an anatomical vector
TW202125415A (zh) 三維目標檢測及模型的訓練方法、設備、儲存媒體
WO2023044605A1 (zh) 极端环境下脑结构的三维重建方法、装置及可读存储介质
WO2022252929A1 (zh) 医学图像中组织结构的层级分割方法、装置、设备及介质
CN112614133A (zh) 一种无锚点框的三维肺结节检测模型训练方法及装置
Raut et al. Gastrointestinal tract disease segmentation and classification in wireless capsule endoscopy using intelligent deep learning model
US20240005650A1 (en) Representation learning
CN111275699A (zh) 医学图像的处理方法、装置、设备及存储介质
Chatterjee et al. A survey on techniques used in medical imaging processing
CN113808130B (zh) 肿瘤图像智能分类方法、装置、设备和存储介质
US20230087494A1 (en) Determining image similarity by analysing registrations
KR102464422B1 (ko) 폐 손상 진단 장치, 진단에 필요한 정보 제공 방법 및 기록매체
US11861846B2 (en) Correcting segmentation of medical images using a statistical analysis of historic corrections
CN114724016A (zh) 图像分类方法、计算机设备和存储介质
US11734849B2 (en) Estimating patient biographic data parameters
CN117616467A (zh) 训练并使用深度学习算法来基于降维表示比较医学图像的方法
CN116490903A (zh) 表示学习
SANONGSIN et al. A New Deep Learning Model for Diffeomorphic Deformable Image Registration Problems
CN115908392A (zh) 图像评估方法及装置、可读存储介质及电子设备
CN117745693A (zh) 病灶靶区的勾画方法、装置、设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20815327

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020815327

Country of ref document: EP

Effective date: 20220103