CN111597946B - Processing method of image generator, image generation method and device - Google Patents


Info

Publication number
CN111597946B
CN111597946B (application CN202010392503.3A)
Authority
CN
China
Prior art keywords
image
sample
source domain
domain
target
Prior art date
Legal status
Active
Application number
CN202010392503.3A
Other languages
Chinese (zh)
Other versions
CN111597946A (en)
Inventor
谢鑫鹏 (Xie Xinpeng)
陈嘉伟 (Chen Jiawei)
李悦翔 (Li Yuexiang)
马锴 (Ma Kai)
郑冶枫 (Zheng Yefeng)
Current Assignee
Tencent Healthcare Shenzhen Co Ltd
Original Assignee
Tencent Healthcare Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Healthcare Shenzhen Co Ltd filed Critical Tencent Healthcare Shenzhen Co Ltd
Priority to CN202010392503.3A
Publication of CN111597946A
Application granted
Publication of CN111597946B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. ICT SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 ICT specially adapted for the handling or processing of medical images
    • G16H 30/20 ICT specially adapted for handling medical images, e.g. DICOM, HL7 or PACS

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Evolutionary Computation (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a processing method of an image generator, an image generation method and an image generation device. The processing method of the image generator comprises the following steps: acquiring a source domain image sample and a reference image sample; generating a target generation image of the source domain image sample in a target domain through an image generator; respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image; generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature; inputting the positive sample and the negative sample into a mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information between the first content feature and the third content feature during the adversarial training, until an iteration stop condition is reached. With this method, deformation of the migrated image can be avoided.

Description

Processing method of image generator, image generation method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a processing method of an image generator, an image generating method and an image generating device.
Background
With the development of artificial intelligence, machine learning models are used more and more widely. A machine learning model is trained so that it can perform data processing. For example, a medical image processing model is trained so that it can classify or segment medical images.
Currently, when a machine learning model is trained on image samples, the samples may come from different image domains (for example, dark image samples in the training set and bright image samples in the test set), which affects the performance of the machine learning model. In the conventional technology, the image samples can first be migrated to the same image domain, and the machine learning model is then trained. However, conventional image migration methods may cause the migrated image to be deformed.
Disclosure of Invention
In view of the above, it is desirable to provide a processing method of an image generator, an image generation method, and an image generation apparatus that can avoid deformation of the migrated image.
A processing method of an image generator, the method comprising:
acquiring an image sample set, an image generator and a mutual information discriminator; the image sample set comprises source domain image samples and reference image samples;
generating a target generation image of the source domain image sample in a target domain through an image generator;
respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image;
generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature;
inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information between the first content feature and the third content feature during the adversarial training, until an iteration stop condition is reached.
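The positive/negative sample construction in the steps above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: content features are stood in for by plain Python lists, and the pairing rule follows the claim (a positive sample joins the first and third content features; a negative sample joins the second content feature with at least one of the other two).

```python
import random

def build_pairs(first_feature, second_feature, third_feature):
    """Build one positive and one negative sample for the mutual
    information discriminator.

    first_feature:  content feature of the source domain image sample
    second_feature: content feature of the reference image sample
    third_feature:  content feature of the target generation image
    """
    # Positive sample: matched pair (source content, generated content).
    positive = first_feature + third_feature
    # Negative sample: mismatched pair built from the reference content
    # and one of the other two content features.
    anchor = random.choice([first_feature, third_feature])
    negative = anchor + second_feature
    return positive, negative
```

The discriminator is then trained to score positive samples high and negative samples low, which is what makes its output usable as a mutual information estimate.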
A processing apparatus of an image generator, the apparatus comprising:
the acquisition module is used for acquiring the image sample set, the image generator and the mutual information discriminator; the image sample set comprises source domain image samples and reference image samples;
the generating module is used for generating a target generation image of the source domain image sample in the target domain through the image generator;
the extraction module is used for respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image;
the generating module is further used for generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature;
and the input module is used for inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information between the first content feature and the third content feature during the adversarial training, until an iteration stop condition is reached.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring an image sample set, an image generator and a mutual information discriminator; the image sample set comprises source domain image samples and reference image samples;
generating a target generation image of the source domain image sample in a target domain through an image generator;
respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image;
generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature;
inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information between the first content feature and the third content feature during the adversarial training, until an iteration stop condition is reached.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring an image sample set, an image generator and a mutual information discriminator; the image sample set comprises source domain image samples and reference image samples;
generating a target generation image of the source domain image sample in a target domain through an image generator;
respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image;
generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature;
inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information between the first content feature and the third content feature during the adversarial training, until an iteration stop condition is reached.
According to the processing method, apparatus, computer device and storage medium of the image generator, a source domain image sample, a reference image sample, an image generator and a mutual information discriminator are acquired; a target generation image of the source domain image sample in a target domain is generated through the image generator; a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image are respectively extracted; a positive sample is generated according to the first content feature and the third content feature, and a negative sample is generated according to the second content feature and at least one of the first content feature and the third content feature; the positive sample and the negative sample are input into the mutual information discriminator, iterative adversarial training is performed on the image generator and the mutual information discriminator, and the mutual information between the first content feature and the third content feature is iteratively maximized during the adversarial training, until an iteration stop condition is reached. In this way, through the adversarial training between the mutual information discriminator and the image generator, when the image generator migrates an image from the source domain to the target domain, the target domain image stays consistent with the source domain image in content features, so that deformation of the target domain image is avoided; moreover, when the target generation image is used to train a medical image processing model, the performance of the medical image processing model can be improved.
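A common way to turn discriminator scores on such positive (joint) and negative (mismatched) samples into a mutual information estimate is a Donsker-Varadhan style lower bound, as in MINE-type estimators. The toy function below sketches that general idea under the assumption that such a bound is used; the patent itself does not fix a particular formula.

```python
import math

def dv_mutual_info_bound(pos_scores, neg_scores):
    """Donsker-Varadhan lower bound on mutual information:
    I(X; Y) >= E_joint[T] - log E_marginal[exp(T)],
    where T is the discriminator score on a sample pair."""
    e_pos = sum(pos_scores) / len(pos_scores)
    log_e_neg = math.log(sum(math.exp(s) for s in neg_scores) / len(neg_scores))
    return e_pos - log_e_neg
```

During training, the discriminator would be updated to tighten this bound, while the generator is updated so that the bound (the estimated mutual information between source content and generated content) stays as large as possible.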
An image generation method, the method comprising:
acquiring an image to be migrated;
determining a source domain to which the image to be migrated belongs and a target domain to which the image is to be migrated;
querying an image generator for migrating an image belonging to a source domain to a target domain;
generating a migration image of the image to be migrated in the target domain through the image generator; the migration image has the same content features as the image to be migrated;
the image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training; the target parameter is the mutual information between content features of a source domain image sample and content features of a generated image sample; the source domain image sample belongs to the source domain; the generated image sample belongs to the target domain and is generated from the source domain image sample by the image generator.
An image generation apparatus, the apparatus comprising:
the acquisition module is used for acquiring an image to be migrated;
the determining module is used for determining a source domain to which the image to be migrated belongs and a target domain to which the image is to be migrated;
a query module for querying an image generator for migrating an image belonging to a source domain to a target domain;
the generation module is used for generating a migration image of the image to be migrated in the target domain through the image generator; the migration image has the same content features as the image to be migrated;
the image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training; the target parameter is the mutual information between content features of a source domain image sample and content features of a generated image sample; the source domain image sample belongs to the source domain; the generated image sample belongs to the target domain and is generated from the source domain image sample by the image generator.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring an image to be migrated;
determining a source domain to which the image to be migrated belongs and a target domain to which the image is to be migrated;
querying an image generator for migrating an image belonging to a source domain to a target domain;
generating a migration image of the image to be migrated in the target domain through the image generator; the migration image has the same content features as the image to be migrated;
the image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training; the target parameter is the mutual information between content features of a source domain image sample and content features of a generated image sample; the source domain image sample belongs to the source domain; the generated image sample belongs to the target domain and is generated from the source domain image sample by the image generator.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring an image to be migrated;
determining a source domain to which the image to be migrated belongs and a target domain to which the image is to be migrated;
querying an image generator for migrating an image belonging to a source domain to a target domain;
generating a migration image of the image to be migrated in the target domain through the image generator; the migration image has the same content features as the image to be migrated;
the image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training; the target parameter is the mutual information between content features of a source domain image sample and content features of a generated image sample; the source domain image sample belongs to the source domain; the generated image sample belongs to the target domain and is generated from the source domain image sample by the image generator.
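The "querying an image generator" step above implies some registry keyed by (source domain, target domain). A minimal sketch of such a lookup follows; the names are illustrative, not taken from the patent.

```python
# Hypothetical registry: one trained generator per (source, target) pair.
generators = {}

def register_generator(source_domain, target_domain, generator_fn):
    generators[(source_domain, target_domain)] = generator_fn

def migrate(image, source_domain, target_domain):
    """Query the generator for this domain pair and apply it."""
    generator_fn = generators.get((source_domain, target_domain))
    if generator_fn is None:
        raise KeyError("no generator registered for "
                       f"{source_domain} -> {target_domain}")
    return generator_fn(image)
```

With a trained generator registered under, say, ("dark_camera", "bright_camera"), a call to migrate(image, "dark_camera", "bright_camera") would produce the migration image with the same content features as the input.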
According to the image generation method, apparatus, computer device and storage medium, an image to be migrated is acquired; a source domain to which the image belongs and a target domain to which the image is to be migrated are determined; an image generator for migrating images belonging to the source domain to the target domain is queried; and a migration image of the image to be migrated in the target domain is generated through the image generator, the migration image having the same content features as the image to be migrated. The image generator is obtained through iterative adversarial training with a mutual information discriminator, with a target parameter iteratively maximized during the adversarial training; the target parameter is the mutual information between content features of a source domain image sample and content features of a generated image sample, where the source domain image sample belongs to the source domain, and the generated image sample belongs to the target domain and is generated from the source domain image sample by the image generator. In this way, through the adversarial training between the mutual information discriminator and the image generator, when the image generator migrates an image from the source domain to the target domain, the target domain image stays consistent with the source domain image in content features, so that deformation of the target domain image is avoided; moreover, when the generated image is used to train a medical image processing model, the performance of the medical image processing model can be improved.
Drawings
FIG. 1 is a diagram of an application environment of a processing method of an image generator in one embodiment;
FIG. 2 is a data flow diagram illustrating a processing method of an image generator in one embodiment;
FIG. 3 is a flow diagram that illustrates a processing method of the image generator in one embodiment;
FIG. 4 is a schematic illustration of an image sample in one embodiment;
FIG. 5 is a block diagram of a training system of the image generator in one embodiment;
FIG. 6 is a block diagram showing the construction of a training system of an image generator in another embodiment;
FIG. 7 is a block diagram showing a configuration of a training system of an image generator in still another embodiment;
FIG. 8 is a block diagram showing a configuration of a training system of an image generator in still another embodiment;
FIG. 9 is a block diagram of a training system of the image generator in one embodiment;
FIG. 10 is a flowchart showing a processing method of an image generator in still another embodiment;
FIG. 11 is a flow diagram illustrating a method of image generation in one embodiment;
FIG. 12 is a block diagram showing the configuration of a processing means of an image generator in one embodiment;
FIG. 13 is a block diagram showing the configuration of an image generating apparatus according to an embodiment;
FIG. 14 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive discipline of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, at both the hardware level and the software level. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specifically studies how a computer can simulate or realize human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The solution provided in the embodiments of the present application relates to artificial intelligence technologies such as machine learning, and is specifically described through the following embodiments:
the processing method of the image generator provided by the application can be applied to the application environment shown in FIG. 1. Wherein the terminal 102 communicates with the server 104 via a network. The terminal 102 acquires an image sample set, an image generator and a mutual information discriminator, wherein the image sample set comprises a source domain image sample and a reference image sample, and uploads the acquired image sample set, the image generator and the mutual information discriminator to the server 104; the server 104 generates a target generation image of the source domain image sample in the target domain through the image generator; the server 104 respectively extracts a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image; the server 104 generates a positive sample according to the first content feature and the third content feature, and generates a negative sample according to at least one of the first content feature and the third content feature and the second content feature; the server 104 inputs the positive sample and the negative sample into the mutual information discriminator, performs iterative confrontation training on the image generator and the mutual information discriminator, and iteratively maximizes mutual information of the first content feature and the third content feature in the confrontation training process until an iteration stop condition is reached.
The terminal 102 may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer or a portable wearable device, and the server 104 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud storage, network services, cloud communication, big data and artificial intelligence platforms. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the present application.
The image generation method provided by the present application can also be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. The terminal 102 acquires an image to be migrated and uploads it to the server 104. On acquiring the image to be migrated, the server 104 determines a source domain to which the image belongs and a target domain to which the image is to be migrated; queries an image generator for migrating images belonging to the source domain to the target domain; and generates, through the image generator, a migration image of the image to be migrated in the target domain, the migration image having the same content features as the image to be migrated. The image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training; the target parameter is the mutual information between content features of a source domain image sample and content features of a generated image sample; the source domain image sample belongs to the source domain; the generated image sample belongs to the target domain and is generated from the source domain image sample by the image generator.
In a specific embodiment, as shown in FIG. 2, a front end running on the terminal 102 may obtain an image to be migrated provided by a user and upload it to a back end (the server 104); the back end executes the image generation method provided by the present application and feeds the generated migration image back to the front end.
In one embodiment, as shown in FIG. 3, a processing method of an image generator is provided. The method is described by taking its application to the server 104 in FIG. 1 as an example, and includes the following steps:
step 302, an image sample set, an image generator and a mutual information discriminator are obtained, wherein the image sample set comprises a source domain image sample and a reference image sample.
In the present application, model developers have designed an adversarial learning network that includes an image generator and a mutual information discriminator. The image generator migrates an original image from one image domain (the source domain) to another image domain (the target domain) to obtain a generated image belonging to the target domain. Because the generated image obtained through the generator's migration may be deformed (for example, objects in the generated image disappear or deform), that is, the generated image differs from the original image in image content, a mutual information discriminator is introduced into the generative adversarial framework to constrain the image content of the generated image, so that the generated image stays consistent with the original image in image content and deformation of the generated image is avoided.
The image sample set is a data set for adversarially training the image generator and the mutual information discriminator. The image domains to which the source domain image sample and the reference image sample belong may be the same or different. Different image domains have different image styles, mainly reflected in differences in color and brightness. A source domain image sample is an image belonging to the source domain. The reference image sample is an image belonging to the source domain or the target domain. It will be appreciated that the source domain is the image domain in which an image is located before migration, and the target domain is the image domain in which the image is located after migration.
In one embodiment, the image samples in the image sample set may be medical image samples. A medical image is a special, dedicated image in the medical field: an image of an internal tissue of a target object (stomach, abdomen, heart, brain, or the like) obtained in a non-invasive manner for medical treatment or medical research. In different medical scenes, the obtained medical images differ because of different imaging devices, imaging modes, and so on. Examples include fundus images, endoscopic images, Computed Tomography (CT) images, Magnetic Resonance Imaging (MRI) images, ultrasound (US) images (e.g., B-mode ultrasound, color Doppler ultrasound, cardiac color ultrasound, three-dimensional color ultrasound), X-ray images, electrocardiograms, electroencephalograms, and optical photographs generated by medical instruments.
The medical images in different medical scenes can be regarded as images of different image domains; for example, the fundus image and the endoscopic image belong to different image domains, respectively. Medical images obtained by different imaging devices or imaging modes in the same medical scene can also be regarded as images of different image domains; for example, fundus images acquired by different fundus cameras belong to different image domains, respectively.
In one embodiment, the image samples in the image sample set may be regular images captured by an image capturing device, such as landscape images or portrait images captured by a camera. Here, different image domains may correspond to different artistic styles, such as an oil painting domain or a watercolor domain. The image samples in the image sample set may also be video frames in a video.
In one particular embodiment, the image sample set may employ a training set that is publicly available in the field of machine learning, such as the REFUGE fundus dataset, the LUNA16 (LUng Nodule Analysis 16) pulmonary nodule detection dataset, the MICCAI (Medical Image Computing and Computer Assisted Intervention Society) pancreas segmentation dataset, the ImageNet dataset, the Microsoft COCO dataset, and the like.
For example, as shown in fig. 4, in the present embodiment the image sample set is specifically the REFUGE dataset, whose training set and test set were acquired by different fundus cameras, so that the image samples of the training set and those of the test set belong to different image domains; the difference mainly shows in the image samples of the training set being lower in brightness than those of the test set. Then, when the test set in fig. 4 is used to test a model trained on the training set in fig. 4, the images in the test set need to be migrated to the image domain of the training-set images before the model test is performed. This may improve the measured performance of the trained model.
It can be understood that, when a medical image processing model is trained on such an image sample set, the performance of the model is reduced because the image samples of the training set and those of the test set belong to different image domains. Therefore, the image samples of the training set and of the test set can be migrated to the same image domain: for example, the training-set samples can be migrated to the image domain of the test-set samples, or the test-set samples can be migrated to the image domain of the training-set samples, thereby improving the performance of the medical image processing model.
Step 304, generating a target generated image of the source domain image sample in the target domain through the image generator.
The target generated image is a real image obtained by transferring a source domain image sample from a source domain to a target domain through an image generator.
In a specific embodiment, the image generator may employ the generator of a GAN (Generative Adversarial Network), or the like.
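The patent does not fix a particular generator architecture, so the following is only an illustrative stand-in: since the domains described above differ mainly in color and brightness, a toy "generator" can be sketched as a clamped brightness shift on pixel values. The function name and the shift value are hypothetical; a real GAN generator is a trained deep network.

```python
def toy_generator(image, brightness_shift):
    # Toy stand-in for a GAN generator: maps source-domain pixel values
    # toward the target-domain style by shifting brightness, clamped to [0, 1].
    # A real generator would be a deep network trained adversarially.
    return [min(1.0, max(0.0, p + brightness_shift)) for p in image]

source_sample = [0.1, 0.4, 0.8]                       # source-domain pixels
target_generated = toy_generator(source_sample, 0.2)  # "migrated" to the target domain
```
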
Step 306, respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image.
In the application, since the image content of the target generated image obtained through the image generator's migration may change, that is, the content features of the target generated image and the source domain image sample may differ, the first content feature of the source domain image sample and the third content feature of the target generated image are extracted and constrained by the mutual information discriminator, so that the source domain image sample and the target generated image remain consistent in image content.
In particular, the mutual information discriminator may be optimized through a classification task. The purpose of introducing the mutual information discriminator is to constrain the first content feature of the source domain image sample and the third content feature of the target generated image so that their content features become the same. As a consequence, the first content feature and the third content feature offer no obvious distinguishability on their own, and a training sample that is clearly distinguishable from them needs to be constructed. Based on this idea, the first content feature and the third content feature are taken together as one training sample; a reference image sample is then selected from the source domain or the target domain, its second content feature is extracted, and another training sample is constructed from the second content feature together with the first content feature, or together with the third content feature.
The content features may include texture features, shape features, spatial relationship features, and the like. Texture features describe the surface properties of the objects (scenes, people, things, etc.) to which an image or image region corresponds. Shape features come in two types: contour features, which mainly concern the outer boundary of a target, and region features, which relate to the entire shape region of a target. Spatial relationship features refer to the mutual spatial positions or relative directional relationships among multiple targets segmented from an image; these relationships can be divided into connection/adjacency relationships, overlapping relationships, inclusion/containment relationships, and the like.
The content characteristics of the reference image sample are different from those of the source domain image sample and the target generation image.
In a specific embodiment, an image can be randomly extracted from other images of the source domain except for the image sample of the source domain as a reference image sample; or randomly extracting an image from other images of the target domain except the target generated image as a reference image sample.
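A minimal sketch of this random selection, with hypothetical names: the excluded index marks the source domain image sample (or the target generated image) so that it cannot serve as its own reference.

```python
import random

def pick_reference_sample(domain_images, excluded_index, rng=random):
    # Randomly pick a reference image from a domain, excluding the image
    # at excluded_index (e.g., the source domain image sample itself).
    candidates = [img for i, img in enumerate(domain_images) if i != excluded_index]
    if not candidates:
        raise ValueError("the domain must contain at least one other image")
    return rng.choice(candidates)

source_domain_images = ["img_0", "img_1", "img_2"]
reference = pick_reference_sample(source_domain_images, excluded_index=0)
```
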
Step 308, generating a positive sample according to the first content characteristic and the third content characteristic, and generating a negative sample according to at least one of the first content characteristic and the third content characteristic and the second content characteristic.
In the present application, although there is a difference between the first content feature and the third content feature, that difference is smaller than both the difference between the first content feature and the second content feature and the difference between the second content feature and the third content feature. Generating a positive sample from the first content feature and the third content feature, and a negative sample from the second content feature together with at least one of the first content feature and the third content feature, therefore enables the mutual information discriminator to gradually learn to distinguish positive samples from negative samples during training.
The positive sample comprises the first content feature, the third content feature, and a corresponding training label. The negative sample comprises one of the following, each with a corresponding training label: the first content feature and the second content feature; the second content feature and the third content feature; or the first, second, and third content features.
It is to be understood that "positive" and "negative" are used herein only to distinguish the training samples and do not constitute a limitation on the training labels of the training samples, i.e., it is also possible to generate negative samples according to the first content feature and the third content feature, and generate positive samples according to the second content feature and at least one of the first content feature and the third content feature.
In one embodiment, the training label for the positive sample may be "real" and the training label for the negative sample may be "fake".
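The pairing and labeling described in step 308 can be sketched as follows. Feature vectors are plain lists here, and the dict layout is a hypothetical convenience, not the patent's data format.

```python
def build_training_samples(z_a, z_b, z_c):
    # z_a: first content feature (source domain image sample)
    # z_b: third content feature (target generated image)
    # z_c: second content feature (reference image sample)
    positive = {"features": (z_a, z_b), "label": "real"}
    negatives = [
        {"features": (z_a, z_c), "label": "fake"},  # first + second
        {"features": (z_c, z_b), "label": "fake"},  # second + third
    ]
    return positive, negatives

positive, negatives = build_training_samples([0.1, 0.2], [0.1, 0.3], [0.9, 0.8])
```
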
Step 310, inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until an iteration stop condition is reached.
In the application, the image generator and the mutual information discriminator are iteratively trained in an adversarial manner, with the positive samples and the negative samples serving as adversarial sample pairs. Adversarial training means that the image generator and the mutual information discriminator form a dynamic game process in which they oppose and thereby promote each other.
Wherein mutual information is used to characterize interdependencies between variables. The larger the mutual information between the first content feature and the third content feature, the more similar the distribution of the first content feature and the third content feature.
In a specific embodiment, the mutual information discriminator may be a general discriminator.
In a particular embodiment, an adaptive moment estimation (Adam) optimizer may be used to optimize the parameters of the image generator and the mutual information discriminator during the adversarial training. In each iteration of the optimization process, the error of the prediction result is computed and back-propagated through the model, the gradients are calculated, and the model's weight and bias parameters are updated.
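As an illustration of the Adam update rule applied per parameter, the sketch below minimizes a toy quadratic loss for a single scalar weight. The loss, hyperparameters, and the added 1/√t learning-rate decay (used so this toy run settles cleanly) are illustrative choices, not the patent's.

```python
import math

def adam_minimize(grad, w, lr=0.5, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    # Standard Adam moment updates with bias correction; a 1/sqrt(t)
    # learning-rate decay is added for stable convergence on this toy problem.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= (lr / math.sqrt(t)) * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Toy loss L(w) = (w - 3)^2 with gradient 2 * (w - 3); Adam drives w toward 3.
w_opt = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
```
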
Specifically, as training progresses, under the constraint of the mutual information discriminator the difference in content features between the source domain image sample and the target generated image gradually becomes smaller, while under the constraint of the image generator the classification accuracy of the mutual information discriminator becomes higher and higher. Moreover, as the parameters of the mutual information discriminator are optimized, the mutual information of the first content feature and the third content feature gradually increases.
It can be understood that, by the method provided in the application, image migration from the target domain to the source domain can likewise be trained, realizing bidirectional and even multidirectional image migration.
In the processing method of the image generator, a source domain image sample, a reference image sample, the image generator, and a mutual information discriminator are obtained; a target generated image of the source domain image sample in the target domain is generated by the image generator; the first content feature of the source domain image sample, the second content feature of the reference image sample, and the third content feature of the target generated image are respectively extracted; a positive sample is generated from the first content feature and the third content feature, and a negative sample is generated from the second content feature together with at least one of the first content feature and the third content feature; the positive sample and the negative sample are input into the mutual information discriminator, iterative adversarial training is performed on the image generator and the mutual information discriminator, and the mutual information of the first content feature and the third content feature is iteratively maximized during the adversarial training until an iteration stop condition is reached. In this way, through the adversarial training between the mutual information discriminator and the image generator, when the image generator migrates an image from the source domain to the target domain, the target-domain image remains consistent with the source-domain image in content features, so that deformation of the target-domain image is avoided; moreover, when the target generated image is used to train a medical image processing model, the performance of that model can be improved.
In one embodiment, inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until an iteration stop condition is reached, comprises: characterizing the mutual information of the first content feature and the third content feature by the relative entropy of the first content feature and the third content feature; constructing a discrimination loss function of the mutual information discriminator according to the cross entropy of the first content feature and the third content feature, the cross entropy being positively correlated with the relative entropy; and inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator in combination with the discrimination loss function, and iteratively optimizing the discrimination loss function and maximizing the mutual information during the adversarial training.
It can be understood that the discrimination loss function of the mutual information discriminator is constructed according to the cross entropy of the first content feature and the third content feature, and is used for optimizing the parameters of the mutual information discriminator according to the error of the prediction result. The cross entropy is related to the relative entropy as follows:
H(Za, Zab) = H(Za) + I^(DV)(Za; Zab)

wherein H(Za, Zab) is the cross entropy of the first content feature and the third content feature, H(Za) is the information entropy of the first content feature, and I^(DV)(Za; Zab) is the DV (Donsker-Varadhan) representation of the KL (Kullback-Leibler) divergence, that is, the relative entropy of the first content feature and the third content feature.
Since the information entropy of the first content feature is fixed, the cross entropy is positively correlated with the relative entropy. During the adversarial training, the parameters of the mutual information discriminator are gradually optimized, the cross entropy becomes smaller and smaller, and therefore the relative entropy becomes smaller and smaller. Relative entropy measures the similarity between two distributions: the smaller the relative entropy, the more similar the two distributions, and the larger the mutual information between them. Based on this, the mutual information of the first content feature and the third content feature can be characterized by their relative entropy, so that this mutual information becomes larger and larger during the adversarial training.
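The identity behind this positive correlation, cross entropy = information entropy + relative entropy (KL divergence), can be checked numerically on a small discrete distribution; the two distributions below are arbitrary illustrative choices.

```python
import math

p = [0.7, 0.3]  # stands in for the distribution of the first content feature
q = [0.5, 0.5]  # stands in for the distribution of the third content feature

entropy_p = -sum(pi * math.log(pi) for pi in p)                   # H(p)
kl_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))       # D_KL(p || q)
cross_entropy = -sum(pi * math.log(qi) for pi, qi in zip(p, q))   # H(p, q)

# H(p, q) = H(p) + D_KL(p || q): with H(p) fixed, shrinking the cross
# entropy necessarily shrinks the relative entropy.
```
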
Specifically, a lower bound on the mutual information of the first content feature and the third content feature is given through the DV representation of the relative entropy, with the formula:

I^(DV)(Za; Zab) = E_J[D_MI(Za, Zab)] − log E_M[e^(D_MI(Za, Zab))]

wherein I^(DV)(Za; Zab) is the DV (Donsker-Varadhan) representation of the KL (Kullback-Leibler) divergence, that is, the relative entropy of the first content feature and the third content feature; E_J[·] is the expectation taken over the joint distribution of the first content feature and the third content feature; E_M[·] is the expectation taken over the product of their marginal distributions; and D_MI: Za × Zab → R is the mutual information discriminator function, with R the set of real numbers.
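For two discrete variables this bound can be verified exactly: for any discriminator function, E_J[D] − log E_M[e^D] never exceeds the true mutual information, and the optimal discriminator log(joint / product of marginals) attains it. The joint distribution below is an arbitrary illustrative choice.

```python
import math
import random

# Joint distribution of two correlated binary variables (illustrative values).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}
py = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1)}

# Exact mutual information: KL divergence between joint and product of marginals.
true_mi = sum(p * math.log(p / (px[x] * py[y])) for (x, y), p in joint.items())

def dv_bound(critic):
    # Donsker-Varadhan value E_J[D] - log E_M[e^D] for a discriminator table D;
    # for any D this lower-bounds the true mutual information.
    e_joint = sum(p * critic[(x, y)] for (x, y), p in joint.items())
    e_marginal = sum(px[x] * py[y] * math.exp(critic[(x, y)])
                     for x in (0, 1) for y in (0, 1))
    return e_joint - math.log(e_marginal)

random.seed(0)
bounds = [dv_bound({k: random.uniform(-3.0, 3.0) for k in joint})
          for _ in range(100)]

# The optimal discriminator log(joint / product of marginals) attains the bound.
optimal_critic = {(x, y): math.log(p / (px[x] * py[y]))
                  for (x, y), p in joint.items()}
```

Because the expectations here are computed exactly over a discrete distribution, the inequality holds deterministically rather than only in expectation.
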
Specifically, during the adversarial training the mutual information discriminator gradually learns to distinguish positive samples from negative samples, its parameters are gradually optimized, and the lower bound on the mutual information of the first content feature and the third content feature becomes tighter and tighter, so that this mutual information becomes larger and larger.

In this embodiment, the mutual information of the first content feature and the third content feature is characterized by their relative entropy; a discrimination loss function of the mutual information discriminator is constructed according to the cross entropy of the first content feature and the third content feature, the cross entropy being positively correlated with the relative entropy; the positive sample and the negative sample are input into the mutual information discriminator, iterative adversarial training is performed on the image generator and the mutual information discriminator in combination with the discrimination loss function, and the discrimination loss function is iteratively optimized and the mutual information maximized during the adversarial training. In this way, the parameters of the mutual information discriminator are optimized through a classification task, thereby maximizing the mutual information of the first content feature and the third content feature.
In one embodiment, respectively extracting the first content feature of the source domain image sample, the second content feature of the reference image sample, and the third content feature of the target generated image includes: acquiring a source domain encoder and a target domain encoder; encoding the source domain image sample into a feature space through the source domain encoder to obtain the first content feature of the source domain image sample; encoding the reference image sample into the feature space through the target domain encoder to obtain the second content feature of the reference image sample; and encoding the target generated image into the feature space through the target domain encoder to obtain the third content feature of the target generated image.
The source domain encoder is used for extracting features from an image of a source domain, and embedding the extracted features into a feature space to obtain a feature vector. The target domain encoder is used for extracting features from the image of the target domain, and embedding the extracted features into the same feature space to obtain feature vectors. The feature space is used for storing feature vectors.
In a specific embodiment, the source domain encoder and the target domain encoder may be general encoders.
In this embodiment, a source domain encoder and a target domain encoder are obtained, a source domain image sample is encoded into a feature space by the source domain encoder to obtain a first content feature of the source domain image sample, a reference image sample is encoded into the feature space by the target domain encoder to obtain a second content feature of the reference image sample, and a target generated image is encoded into the feature space by the target domain encoder to obtain a third content feature of the target generated image.
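A minimal linear sketch of two encoders mapping images into one shared feature space. Real encoders are deep (typically convolutional) networks; the weight matrices and the two-pixel "image" here are hypothetical.

```python
def encode(weights, image):
    # Linear "encoder": matrix-vector product embedding a flattened image
    # into the shared feature space.
    return [sum(w * x for w, x in zip(row, image)) for row in weights]

source_encoder = [[0.5, 0.0], [0.0, 0.5]]  # hypothetical source domain encoder
target_encoder = [[0.4, 0.1], [0.1, 0.4]]  # hypothetical target domain encoder

image = [2.0, 4.0]                   # a flattened two-pixel "image"
z_a = encode(source_encoder, image)  # e.g., first content feature
z_b = encode(target_encoder, image)  # e.g., third content feature
```

Both encoders emit vectors of the same dimension, which is what lets the mutual information discriminator compare features from different domains in one space.
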
In one embodiment, inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until an iteration stop condition is reached, comprises: inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until the iteration stop condition is reached.
Specifically, as shown in fig. 5, fig. 5 is a block diagram of a training system of the image generator in one embodiment, where Ia is the source domain image sample, Ib is the target generated image, Ic is the reference image sample, Za is the first content feature, Zb is the third content feature, and Zc is the second content feature. During the adversarial training, the image generator, the source domain encoder, the target domain encoder, and the mutual information discriminator form a dynamic game process in which they oppose and thereby promote one another.
Specifically, as training progresses, under the constraints of a source domain encoder, a target domain encoder and a mutual information discriminator, the image generator generates a target generation image and a source domain image sample, and the difference of content features between the target generation image and the source domain image sample gradually becomes smaller; the source domain encoder extracts the features of the source domain image sample more and more accurately under the constraints of the image generator, the target domain encoder and the mutual information discriminator; under the constraints of the image generator, the source domain encoder and the mutual information discriminator, the target domain encoder extracts the characteristics of the target generated image and the reference image sample more and more accurately; under the constraints of an image generator, a source domain encoder and a target domain encoder, a mutual information discriminator has higher and higher classification accuracy.
In this embodiment, the positive sample and the negative sample are input into the mutual information discriminator, iterative adversarial training is performed on the image generator, the source domain encoder, the target domain encoder, and the mutual information discriminator, and the mutual information of the first content feature and the third content feature is iteratively maximized during the adversarial training until the iteration stop condition is reached.
In one embodiment, the method further comprises: acquiring a source domain decoder and a target domain decoder; mapping the first content feature of the source domain image sample to the source domain through the source domain decoder to obtain a first reconstructed image; mapping the first content feature of the source domain image sample to the target domain through the target domain decoder to obtain a second reconstructed image; and constructing a first loss function based on the difference between the source domain image sample and the first reconstructed image, and a second loss function based on the difference between the target generated image and the second reconstructed image. Inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until an iteration stop condition is reached then comprises: inputting the positive sample and the negative sample into the mutual information discriminator, and, in combination with the first loss function and the second loss function, performing iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator, iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until the iteration stop condition is reached.
It will be appreciated that when the source domain encoder encodes the first content feature into the feature space, the source domain encoder may encode the source domain information into the feature space together; similarly, when the target encoder encodes the second content feature and the third content feature into the feature space, the target domain encoder may encode the target domain information into the feature space together. The source domain information and the target domain information may affect the training efficiency of the image generator to a certain extent. Based on this, the source domain information can be separated from the feature space by the source domain decoder, and the target domain information can be separated from the feature space by the target domain decoder.
The source domain decoder is used for reconstructing the feature vector obtained according to the source domain encoder or the target domain encoder to a source domain; and the target domain decoder is used for reconstructing the feature vector obtained by the source domain encoder or the target domain encoder to a target domain. The first reconstructed image is a real image obtained by reconstructing the first content characteristics to a source domain through a source domain decoder; the second reconstructed image is a real image obtained by reconstructing the first content features to the target domain through the target domain decoder. A first loss function is used to reduce the L1 norm between the source domain image samples and the first reconstructed image; the second loss function is used to reduce the L1 norm between the target generated image and the second reconstructed image.
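The L1 reconstruction losses above reduce to a mean absolute pixel difference; a minimal sketch follows, with illustrative pixel values.

```python
def l1_loss(image, reconstruction):
    # Mean absolute difference between corresponding pixels; minimizing it
    # pulls the reconstruction toward the original image.
    assert len(image) == len(reconstruction)
    return sum(abs(a - b) for a, b in zip(image, reconstruction)) / len(image)

source_sample = [0.2, 0.5, 0.9]           # e.g., the source domain image sample
first_reconstruction = [0.25, 0.45, 0.9]  # e.g., the first reconstructed image
loss = l1_loss(source_sample, first_reconstruction)
```
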
Specifically, as shown in fig. 6, fig. 6 is a block diagram of a training system of the image generator in another embodiment, where Ia is the source domain image sample, Ib is the target generated image, Za is the first content feature, Zaa is the first reconstructed image, and Zab is the second reconstructed image. When no source domain information is present in the feature space, the source domain image sample can be infinitely close to the first reconstructed image, and the target generated image can be infinitely close to the second reconstructed image. Taking the target generated image and the second reconstructed image as an example: the second reconstructed image is also an image of the source domain image sample in the target domain, differing from the target generated image only in how it is generated, and the target generated image carries no source domain information; therefore, when no source domain information exists in the feature space, the second reconstructed image is infinitely close to the target generated image. Accordingly, the first loss function is constructed based on the difference between the source domain image sample and the first reconstructed image, and this difference is reduced through the first loss function so that the source domain decoder learns the source domain information; the second loss function is constructed based on the difference between the target generated image and the second reconstructed image, and this difference is reduced through the second loss function so that the target domain decoder learns the target domain information.
In one embodiment, the third content feature of the target generated image may be mapped to the source domain through the source domain decoder to obtain a third reconstructed image, and mapped to the target domain through the target domain decoder to obtain a fourth reconstructed image; a third loss function is constructed based on the difference between the source domain image sample and the third reconstructed image, and a fourth loss function is constructed based on the difference between the target generated image and the fourth reconstructed image; the positive sample and the negative sample are then input into the mutual information discriminator in combination with the third loss function and the fourth loss function.
The third reconstructed image is a real image obtained by reconstructing the third content characteristics to the source domain through a source domain decoder; the fourth reconstructed image is a real image obtained by reconstructing the third content feature to the target domain through the target domain decoder. A third loss function is used to reduce the L1 norm between the source domain image samples and the third reconstructed image; the fourth loss function is used to reduce the L1 norm between the target generated image and the fourth reconstructed image.
Specifically, as shown in fig. 7, fig. 7 is a block diagram of a training system of an image generator in a further embodiment. Wherein Ia is a source domain image sample, Ib is a target generated image, Zb is a third content feature, Zba is a third reconstructed image, and Zbb is a fourth reconstructed image. When the target domain information does not exist in the feature space, the source domain image sample can be infinitely close to the third reconstructed image, and the target generation image is infinitely close to the fourth reconstructed image. Therefore, a third loss function is constructed based on the difference between the source domain image sample and the third reconstructed image, and the difference between the source domain image sample and the third reconstructed image is reduced through the third loss function, so that the source domain decoder learns the source domain information; and constructing a fourth loss function based on the difference between the target generation image and the fourth reconstruction image, and reducing the difference between the target generation image and the fourth reconstruction image through the fourth loss function so that the target domain decoder learns the target domain information.
In one embodiment, the extracting the first content feature of the source domain image sample, the second content feature of the reference image sample, and the third content feature of the target generation image respectively includes: acquiring a source domain encoder and a target domain encoder; encoding the source domain image sample into a feature space through a source domain encoder to obtain a first encoding feature of the source domain image sample; the first encoding characteristic comprises at least a first content characteristic; encoding the reference image sample into a feature space through a target domain encoder to obtain a second encoding feature of the reference image sample; the second coding features at least comprise second content features, and the target generated image is coded into a feature space through a target domain coder to obtain third coding features of the target generated image; the third encoding characteristic includes at least a third content characteristic.
The processing method of the image generator further comprises: acquiring a source domain decoder and a target domain decoder; mapping a first coding characteristic of a source domain image sample to a source domain through a source domain decoder to obtain a first reconstructed image; mapping the first coding characteristics of the source domain image sample to a target domain through a target domain decoder to obtain a second reconstructed image; a first loss function is constructed based on differences between the source domain image samples and the first reconstructed image, and a second loss function is constructed based on differences between the target generated image and the second reconstructed image.
Inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until an iteration stop condition is reached includes: inputting the positive sample and the negative sample into the mutual information discriminator; performing, in combination with the first loss function and the second loss function, iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator; and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until the first coding feature includes only the first content feature, the second coding feature includes only the second content feature, the third coding feature includes only the third content feature, and the iteration stop condition is reached.
It can be understood that, in the early stages of iterative adversarial training, the coding features produced by an encoder include both content features and domain features. For example, the source domain encoder encodes the source domain image sample into a first coding feature that includes the first content feature and a source domain feature; likewise, the target domain encoder encodes the target generated image into a third coding feature that includes the third content feature and a target domain feature. During iterative adversarial training, simultaneously minimizing the first loss function and the second loss function causes the encoders to learn content feature dissociation, i.e., to remove domain features during encoding, while the decoders learn to recover the domain features during decoding. In this way, in the later stages of iterative adversarial training, the coding features produced by the encoders include only content features, and can better serve the mutual information discriminator in maximizing the mutual information between content features. It should be noted that a coding feature "including only the content feature" refers to the target state, allowing for errors that are tolerable in practice. Also, since the mapping between the source domain and the target domain is symmetric, the mapping from the target domain to the source domain, the content feature dissociation during encoding, and the feature recovery during decoding are analogous to the foregoing embodiment.
In a specific embodiment, the source domain decoder and the target domain decoder may both adopt a general-purpose decoder.
Specifically, during adversarial training, the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator form a dynamic game in which the parties oppose and promote one another. As training proceeds, the difference in content features between the target generated image produced by the image generator and the source domain image sample gradually decreases; the source domain encoder extracts features of the source domain image sample more and more accurately; the target domain encoder extracts features of the target generated image and the reference image sample more and more accurately; the source domain decoder gradually separates the source domain information from the feature space; the target domain decoder gradually separates the target domain information from the feature space; and the classification accuracy of the mutual information discriminator becomes higher and higher.
In a specific embodiment, when the positive sample is generated from the first content feature and the third content feature and the negative sample is generated from the first content feature and the second content feature, the second content feature may come from an image sample of the target domain. The first content feature comes from the source domain and the third content feature comes from the target domain, so if the second content feature also comes from the target domain, interference from domain information during training can be reduced. Likewise, when the negative sample is generated from the second content feature and the third content feature, the second content feature may come from an image sample of the source domain.
In this embodiment, a source domain decoder and a target domain decoder are acquired; the first coding feature of the source domain image sample is mapped to the source domain through the source domain decoder to obtain a first reconstructed image, and to the target domain through the target domain decoder to obtain a second reconstructed image; a first loss function is constructed based on the difference between the source domain image sample and the first reconstructed image, and a second loss function based on the difference between the target generated image and the second reconstructed image. The positive sample and the negative sample are input to the mutual information discriminator, and, in combination with the first loss function and the second loss function, iterative adversarial training is performed on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator, with the mutual information of the first content feature and the third content feature iteratively maximized during the adversarial training. This joint iterative adversarial training improves the training efficiency of the image generator.
In one embodiment, the method further includes: acquiring an image discriminator. Inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training until an iteration stop condition is reached includes: inputting the source domain image sample and the reference image sample into the image discriminator, inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative dual adversarial training on the image generator, the image discriminator, and the mutual information discriminator, and iteratively maximizing the mutual information during the adversarial training until the iteration stop condition is reached.
The image discriminator is configured to perform adversarial training with the image generator.
It can be understood that, because the quality of the image generated by the image generator lacks constraints, an image discriminator is introduced to improve the image quality of the target generated image based on a generative adversarial framework. As shown in fig. 8, fig. 8 is a block diagram of a training system of an image generator in a further embodiment. The image generator generates the target generated image from the source domain image sample; the source domain image sample and the target generated image are input to the image discriminator, which distinguishes between them, so that the image generator and the image discriminator are trained jointly through continuous adversarial interplay.
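The adversarial game between the image generator and the image discriminator can be illustrated with the standard binary cross-entropy objective of a GAN. This is a hedged sketch: the scalar scores and the use of BCE are illustrative assumptions, not values or formulas taken from the embodiment, and a real discriminator would compute its scores with a neural network over images.

```python
import math

def bce(prediction, label):
    # Binary cross-entropy for a single scalar prediction in (0, 1).
    eps = 1e-12
    return -(label * math.log(prediction + eps)
             + (1 - label) * math.log(1 - prediction + eps))

# Illustrative discriminator scores.
d_real = 0.9  # score for the real source domain image sample
d_fake = 0.2  # score for the target generated image

# Discriminator objective: push real scores toward 1, generated toward 0.
discriminator_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)
# Generator objective: fool the discriminator, pushing d_fake toward 1.
generator_loss = bce(d_fake, 1.0)
```

Alternating these two objectives is what produces the "continuous adversarial interplay" described above.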
Specifically, as training progresses, under the constraints of the image discriminator and the mutual information discriminator, the difference in content features between the source domain image sample and the target generated image gradually decreases; under the constraints of the image generator and the image discriminator, the classification accuracy of the mutual information discriminator becomes higher and higher.
In a specific embodiment, the image discriminator may adopt the discriminator of a GAN (Generative Adversarial Network), or the like.
In this embodiment, an image discriminator is acquired; the source domain image sample and the reference image sample are input to the image discriminator, and the positive sample and the negative sample are input to the mutual information discriminator; iterative dual adversarial training is performed on the image generator, the image discriminator, and the mutual information discriminator, and the mutual information is iteratively maximized during the adversarial training until the iteration stop condition is reached, thereby improving both the performance and the training efficiency of the image generator.
In one embodiment, the image generator is a target domain image generator and the image discriminator is a target domain image discriminator. The method further includes: acquiring a source domain image generator and a source domain image discriminator; generating, through the source domain image generator, a restored image of the target generated image in the source domain; and constructing a cycle consistency loss over the source domain image generator, the source domain image discriminator, the target domain image generator, and the target domain image discriminator. Inputting the source domain image sample and the reference image sample into the image discriminator, inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative dual adversarial training on the image generator, the image discriminator, and the mutual information discriminator, and iteratively maximizing the mutual information during the adversarial training until an iteration stop condition is reached includes: inputting the source domain image sample and the target generated image into the target domain image discriminator, inputting the source domain image sample and the restored image into the source domain image discriminator, and inputting the positive sample and the negative sample into the mutual information discriminator; performing, in combination with the cycle consistency loss, iterative dual adversarial training on the source domain image generator, the source domain image discriminator, the target domain image generator, the target domain image discriminator, and the mutual information discriminator; and iteratively maximizing the mutual information during the adversarial training until the iteration stop condition is reached.
The target domain image generator is used for migrating an image from the source domain to the target domain; the target domain image discriminator is used for discriminating images generated by the target domain image generator; the source domain image generator is used for migrating an image from the target domain to the source domain; and the source domain image discriminator is used for discriminating images generated by the source domain image generator. The restored image is the image obtained by mapping the target generated image back to the source domain through the source domain image generator. The cycle consistency loss includes the loss calculated by the target domain image discriminator and the loss calculated by the source domain image discriminator, and is used for optimizing the parameters of the source domain image generator, the source domain image discriminator, the target domain image generator, and the target domain image discriminator.
It can be understood that, because the quality of the image generated by the target domain image generator lacks constraints, the source domain image generator, the target domain image discriminator, and the source domain image discriminator are introduced to improve the quality of the target generated image based on a generative adversarial framework. The target domain image generator generates the target generated image from the source domain image sample; the source domain image sample and the target generated image are input to the target domain image discriminator, which distinguishes between them. The source domain image generator generates a restored image of the target generated image in the source domain; the source domain image sample and the restored image are input to the source domain image discriminator, which distinguishes between them. In this way, the target domain image generator is trained together with the source domain image generator, the target domain image discriminator, and the source domain image discriminator through continuous adversarial interplay.
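The cycle consistency idea above can be sketched with a toy pair of generators that apply and undo a constant style shift. The `style_shift` function and the L1 cycle loss are illustrative stand-ins, under stated assumptions, for the learned CycleGAN-style generators; they are not the embodiment's networks.

```python
def l1(a, b):
    # Mean absolute difference between two equally sized pixel lists.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def style_shift(image, shift):
    # Toy stand-in for a generator: applies a constant brightness shift.
    return [pixel + shift for pixel in image]

source_sample = [0.3, 0.6, 0.9]
# Source -> target via the target domain image generator (toy version).
target_generated = style_shift(source_sample, 0.1)
# Target -> source via the source domain image generator (toy inverse).
restored = style_shift(target_generated, -0.1)

# Cycle consistency: the restored image should match the original sample.
cycle_loss = l1(source_sample, restored)
```

When the two generators are true inverses of each other, the cycle loss vanishes; penalizing it during training pushes the learned generators toward that state.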
Specifically, as training progresses, under the constraints of the source domain image generator, the target domain image discriminator, the source domain image discriminator, and the mutual information discriminator, the difference in content features between the source domain image sample and the target generated image produced by the target domain image generator gradually decreases; under the constraints of the target domain image generator, the source domain image generator, the target domain image discriminator, and the source domain image discriminator, the classification accuracy of the mutual information discriminator becomes higher and higher.
In a specific embodiment, the target domain image generator and the source domain image generator may adopt the generator of CycleGAN or the like, and the target domain image discriminator and the source domain image discriminator may adopt the discriminator of CycleGAN or the like.
In this embodiment, a source domain image generator and a source domain image discriminator are acquired; a restored image of the target generated image in the source domain is generated through the source domain image generator; and a cycle consistency loss is constructed over the source domain image generator, the source domain image discriminator, the target domain image generator, and the target domain image discriminator. The source domain image sample and the target generated image are input to the target domain image discriminator, the source domain image sample and the restored image are input to the source domain image discriminator, and the positive sample and the negative sample are input to the mutual information discriminator. In combination with the cycle consistency loss, iterative dual adversarial training is performed on the source domain image generator, the source domain image discriminator, the target domain image generator, the target domain image discriminator, and the mutual information discriminator, and the mutual information is iteratively maximized during the adversarial training until the iteration stop condition is reached, thereby improving both the performance and the training efficiency of the image generator.
In one embodiment, generating the positive sample from the first content feature and the third content feature, and generating the negative sample from the second content feature together with at least one of the first content feature and the third content feature, includes: stitching the first content feature and the third content feature to obtain the positive sample; and stitching at least one of the first content feature and the third content feature with the second content feature to obtain the negative sample.
Content features may be stitched along the channel dimension. Taking the first content feature and the second content feature as an example, if both are 64 × 256, their concatenation is 64 × 512.
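The channel-wise stitching in this example can be sketched in plain Python, representing a 64 × 256 feature as 64 positions of 256 channels each. This layout is an illustrative assumption, not the embodiment's tensor format.

```python
def concat_channels(feature_a, feature_b):
    # Concatenate two feature maps along the channel dimension; each
    # feature is a list of per-position channel vectors.
    assert len(feature_a) == len(feature_b)
    return [row_a + row_b for row_a, row_b in zip(feature_a, feature_b)]

# 64 positions with 256 channels each, as in the example above.
first_content = [[0.0] * 256 for _ in range(64)]
second_content = [[0.0] * 256 for _ in range(64)]

# Stitching the first and second content features yields a negative sample
# whose shape is 64 x 512: positions preserved, channels doubled.
negative_pair = concat_channels(first_content, second_content)
```

Stitching the first and third content features the same way would yield the positive sample.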
It can be understood that the negative sample is used to increase the relative entropy of the positive sample during classification by the mutual information discriminator. Therefore, the negative sample only needs to differ sufficiently from the positive sample for the mutual information discriminator to distinguish them; whether the content features contained in the negative sample are the stitching result of the first and second content features, of the third and second content features, or of all three, the relative entropy of the positive sample is not affected.
In this embodiment, the first content feature and the third content feature are spliced to obtain a positive sample, and at least one of the first content feature and the third content feature is spliced with the second content feature to obtain a negative sample, so that the mutual information discriminator can be conveniently trained through the positive sample and the negative sample.
In one embodiment, a processing method of an image generator is provided. As shown in fig. 9, a training system of the image generator includes: a target domain image generator, a target domain image discriminator, a source domain image generator, a source domain image discriminator, a source domain encoder, a target domain encoder, a source domain decoder, a target domain decoder, and a mutual information discriminator. As shown in fig. 10, the method includes:
Step 1002, obtain an image sample set, a target domain image generator, a target domain image discriminator, a source domain image generator, a source domain image discriminator, a source domain encoder, a target domain encoder, a source domain decoder, a target domain decoder, and a mutual information discriminator. The image sample set includes source domain image samples and target domain image samples.
Step 1004, generate a target generated image of the source domain image sample in the target domain through the target domain image generator, generate a restored image of the target generated image in the source domain through the source domain image generator, and construct the cycle consistency loss of the source domain image generator, the source domain image discriminator, the target domain image generator, and the target domain image discriminator.
Step 1006, encoding the source domain image sample into a feature space through a source domain encoder to obtain a first encoding feature of the source domain image sample, where the first encoding feature at least includes a first content feature, encoding the target domain image sample into the feature space through a target domain encoder to obtain a second encoding feature of the target domain image sample, where the second encoding feature at least includes a second content feature, and encoding the target generated image into the feature space through the target domain encoder to obtain a third encoding feature of the target generated image, where the third encoding feature at least includes a third content feature.
Step 1008, mapping, by the source domain decoder, the first coding feature of the source domain image sample to the source domain to obtain a first reconstructed image, mapping, by the target domain decoder, the first coding feature of the source domain image sample to the target domain to obtain a second reconstructed image, constructing a first loss function based on a difference between the source domain image sample and the first reconstructed image, and constructing a second loss function based on a difference between the target generated image and the second reconstructed image.
Step 1010, mapping the third coding feature of the target generated image to the source domain through the source domain decoder to obtain a third reconstructed image; mapping the third coding feature of the target generated image to the target domain through the target domain decoder to obtain a fourth reconstructed image; constructing a third loss function based on the difference between the source domain image sample and the third reconstructed image, and a fourth loss function based on the difference between the target generated image and the fourth reconstructed image.

Step 1012, splicing the first coding feature and the third coding feature to obtain a positive sample, and splicing at least one of the first coding feature and the third coding feature with the second coding feature to obtain a negative sample.
Step 1014, constructing a discrimination loss function of the mutual information discriminator according to the cross entropy of the first coding feature and the third coding feature.
Step 1016, inputting the source domain image sample and the target generated image into the target domain image discriminator, inputting the source domain image sample and the restored image into the source domain image discriminator, and inputting the positive sample and the negative sample into the mutual information discriminator; combining the cycle consistency loss, the first loss function, the second loss function, the third loss function, the fourth loss function, and the discrimination loss function, performing iterative dual adversarial training on the target domain image generator, the target domain image discriminator, the source domain image generator, the source domain image discriminator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator; and iteratively optimizing the discrimination loss function and maximizing the mutual information during the adversarial training until the first coding feature includes only the first content feature, the second coding feature includes only the second content feature, the third coding feature includes only the third content feature, and the iteration stop condition is reached.
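Step 1016 combines six loss terms into one training objective. A minimal sketch of such a combination is shown below; the weighted-sum form and the weight values are illustrative hyperparameters assumed for this sketch, since the embodiment does not specify how the losses are combined.

```python
def total_objective(cycle_loss, reconstruction_losses, discrimination_loss,
                    weights=(10.0, 1.0, 1.0)):
    # Weighted sum of the loss terms named in step 1016. The weights are
    # illustrative hyperparameters, not values taken from the embodiment.
    w_cycle, w_recon, w_disc = weights
    return (w_cycle * cycle_loss
            + w_recon * sum(reconstruction_losses)
            + w_disc * discrimination_loss)

# Toy loss values: cycle consistency loss, the four reconstruction losses
# (first to fourth loss functions), and the discrimination loss.
loss = total_objective(0.05, [0.02, 0.03, 0.01, 0.04], 0.7)
```

In practice each component network would be updated against this combined objective at every iteration of the dual adversarial training.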
The device represents the mutual information of the first coding feature and the third coding feature through their relative entropy, and the cross entropy is positively correlated with the relative entropy. When the first coding feature includes only the first content feature and the third coding feature includes only the third content feature, the mutual information of the first coding feature and the third coding feature is exactly the mutual information of the first content feature and the third content feature.
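The stated positive correlation between cross entropy and relative entropy follows from the identity H(p, q) = H(p) + D(p || q): for a fixed distribution p, the cross entropy grows exactly as the relative entropy grows. A short numerical check, using an illustrative two-class distribution:

```python
import math

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

def relative_entropy(p, q):
    # KL divergence D(p || q) = sum_i p_i * log(p_i / q_i)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p = [0.7, 0.3]  # illustrative distribution over the two class labels
q = [0.5, 0.5]

entropy_p = cross_entropy(p, p)  # H(p): entropy of p
h_pq = cross_entropy(p, q)       # H(p, q): cross entropy of p and q
kl = relative_entropy(p, q)      # D(p || q): relative entropy
# H(p, q) = H(p) + D(p || q) holds term by term, which is why optimizing a
# cross-entropy discrimination loss also moves the relative entropy.
```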
It can be understood that, by the method provided by the present application, image migration from the source domain to the target domain and image migration from the target domain to the source domain can be trained, so that bidirectional image migration and multidirectional image migration are realized.
In this embodiment, through adversarial training between the mutual information discriminator and the image generator, when the image generator migrates an image from the source domain to the target domain, the target domain image remains consistent with the source domain image in content features, thereby avoiding deformation of the target domain image. Moreover, when the target generated image is used to train a medical image processing model, the performance of the medical image processing model can be improved.
It can be understood that, in this embodiment, the computer device constructs a generative adversarial network based on the image generator from the source domain to the target domain, so as to train the image generator in an adversarial manner. The generative adversarial network includes two pairs of image generators and discriminators, an X-shaped dual autoencoder, and a mutual information discriminator. The X-shaped dual autoencoder includes a source domain encoder, a source domain decoder, a target domain encoder, and a target domain decoder. During adversarial training, the encoders in the X-shaped dual autoencoder learn to remove domain features during encoding, the decoders learn to recover domain information during decoding, and the mutual information discriminator learns to maximize the mutual information between the content features of the source domain image sample and the target generated image. Therefore, the image generator minimizes deformation of the image content when generating an image: the original image and the generated image have the same content features.
In one embodiment, as shown in fig. 11, an image generation method is provided, which is described by taking the method as an example applied to the server 104 in fig. 1, and includes the following steps:
Step 1102, acquiring an image to be migrated.
The image to be migrated refers to an image to be subjected to image migration processing.
In one embodiment, the image to be migrated may be a medical image. A medical image is an image specific to the medical field, referring to an image of internal tissue obtained non-invasively from a target object for medical treatment or medical research, such as images generated by medical instruments: fundus images, Computed Tomography (CT) images, Magnetic Resonance Imaging (MRI) images, ultrasound images (B-mode ultrasound, color Doppler ultrasound, echocardiography, and three-dimensional color ultrasound), X-ray images, electrocardiograms, electroencephalograms, and optical photography.
Step 1104, determining the source domain to which the image to be migrated belongs and the target domain to which the image is to be migrated.
The source domain represents the image domain in which an image resides before migration, and the target domain represents the image domain in which the image resides after migration. Different image domains have different image styles, which mainly manifest as differences in color and brightness.
Step 1106, querying an image generator for migrating images belonging to the source domain to the target domain.
The image generator is configured to migrate an image from the source domain to the target domain, and may implement unidirectional, bidirectional, or multidirectional image domain migration. Unidirectional image domain migration means that the image generator migrates an image from a first image domain to a second image domain, or from the second image domain to the first. Bidirectional image domain migration means that the image generator can migrate an image both from the first image domain to the second and from the second image domain to the first. Multidirectional image domain migration means that an image is migrated from the source domain to at least two different target domains.
Step 1108, generating, through the image generator, a migrated image of the image to be migrated in the target domain, the migrated image having the same content features as the image to be migrated. The image generator is obtained through iterative adversarial training with a mutual information discriminator, with a target parameter iteratively maximized during the adversarial training; the target parameter is the mutual information between the content features of a source domain image sample and the content features of a target generated image; the source domain image sample belongs to the source domain; the target generated image belongs to the target domain and is generated from the source domain image sample by the image generator.
The migrated image is the image obtained by migrating the image to be migrated from the source domain to the target domain through the image generator.
The content features may include texture features, shape features, spatial relationship features, and the like. Texture features describe the surface properties of the objects (e.g., scenes, people, items) to which an image or image region corresponds. Shape features come in two types: contour features, which mainly concern the outer boundary of a target, and region features, which relate to the entire shape region of a target. Spatial relationship features refer to the mutual spatial positions or relative directional relationships among multiple targets segmented from an image; these relationships can be divided into connection/adjacency relationships, overlapping relationships, inclusion/containment relationships, and the like.
The mutual information discriminator is used for performing adversarial training with the image generator and constraining the image content of the target generated image produced by the image generator, so that the target generated image and the source domain image sample are consistent in image content.
In a specific embodiment, the image generator may adopt the generator of a GAN (Generative Adversarial Network) or the like, and the mutual information discriminator may be a general-purpose discriminator.
It can be understood that, since the image content of the target generated image may change during migration, i.e., the content features of the target generated image and the source domain image sample may differ, the first content feature of the source domain image sample and the third content feature of the target generated image are extracted and constrained by the mutual information discriminator, so that the source domain image sample and the target generated image are consistent in image content.
Specifically, the mutual information discriminator may be optimized through a classification task. The purpose of introducing the mutual information discriminator is to constrain the first content feature of the source domain image sample and the third content feature of the target generated image so that the content features of the two images become the same. However, this means the first and third content features have no obvious distinguishability, so training samples with obvious distinguishability need to be constructed. Based on this idea, the first content feature and the third content feature form one training sample; a reference image sample is selected from the source domain or the target domain, its second content feature is extracted, and another training sample is constructed from the second content feature together with the first content feature, or the second content feature together with the third content feature. The content features of the reference image sample differ from those of both the source domain image sample and the target generated image.
In a specific embodiment, an image may be randomly extracted from the images of the source domain other than the source domain image sample as the reference image sample; alternatively, an image may be randomly extracted from the images of the target domain other than the target generated image as the reference image sample.
In the present application, although there is a difference between the first content feature and the third content feature, this difference is smaller than the difference between the first and second content features and the difference between the second and third content features. A positive sample is generated from the first and third content features, and a negative sample is generated from the second content feature together with at least one of the first and third content features, so that the mutual information discriminator gradually learns to distinguish positive samples from negative samples during training.
The positive sample comprises the first content feature, the third content feature, and a corresponding training label; the negative sample comprises the first and second content features, or the second and third content features, or the first, second, and third content features, together with a corresponding training label. It should be understood that "positive" and "negative" are used here only to distinguish the training samples and do not limit their training labels; that is, it is equally possible to generate the negative sample from the first and third content features and the positive sample from the second content feature together with at least one of the first and third content features. In one embodiment, the training label of the positive sample may be "real" and that of the negative sample may be "fake".
In a specific embodiment, the first content feature and the third content feature are concatenated (spliced) to obtain the positive sample, and at least one of the first and third content features is concatenated with the second content feature to obtain the negative sample.
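The splicing described above can be sketched as follows (a minimal NumPy illustration; the 128-dimensional feature size, the function name `make_pairs`, and the numeric labels are assumptions for illustration, not part of this application):

```python
import numpy as np

def make_pairs(z_a, z_ab, z_c):
    """Concatenate ("splice") content features into training samples for the
    mutual information discriminator.

    z_a  : first content feature (from the source domain image sample)
    z_ab : third content feature (from the target generated image)
    z_c  : second content feature (from the reference image sample)
    """
    positive = np.concatenate([z_a, z_ab])   # label "real" (here: 1)
    negative = np.concatenate([z_a, z_c])    # label "fake" (here: 0)
    return (positive, 1), (negative, 0)

z_a, z_ab, z_c = np.random.randn(3, 128)
(pos, y_pos), (neg, y_neg) = make_pairs(z_a, z_ab, z_c)
assert pos.shape == (256,) and neg.shape == (256,)
```

The negative pair could equally use the third and second features, or all three, as described above.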
It can be understood that the negative samples are used to increase the relative entropy of the positive samples during classification by the mutual information discriminator. The negative sample therefore only needs to differ sufficiently from the positive sample for the mutual information discriminator to tell them apart; since the content features it contains are the concatenation of the first and second content features, of the third and second content features, or of the first, third, and second content features, the relative entropy of the positive samples is not affected.
In this application, the positive samples and negative samples are used as adversarial sample pairs to perform iterative adversarial training on the image generator and the mutual information discriminator. Adversarial training means that the image generator and the mutual information discriminator form a dynamic "game process" in which the two oppose and promote each other.
In a specific embodiment, an adaptive moment estimation (Adam) optimizer may be used to optimize the parameters of the image generator and the mutual information discriminator during adversarial training. In each iteration of the optimization process, the error of the prediction result is computed and back-propagated through the model, gradients are calculated, and the model parameters and bias parameters are continuously updated.
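A single Adam update step can be sketched as follows (a toy NumPy illustration of the optimizer's moment estimates and bias correction; the function name, the quadratic objective, and all hyperparameter values are illustrative assumptions):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and its
    square, bias correction, then the parameter update."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)      # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)      # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy objective f(x) = x^2 (gradient 2x), starting from x = 1.0.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
assert abs(x) < 0.2   # x has been driven toward the minimum at 0
```

In the training described here, such updates would be applied to all parameters of the image generator and the mutual information discriminator.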
Specifically, as training progresses, the difference in content features between the source domain image sample and the target generated image gradually decreases under the constraint of the mutual information discriminator, while the classification accuracy of the mutual information discriminator increases under the constraint of the image generator. Moreover, as the parameters of the mutual information discriminator are optimized, the mutual information between the first content feature and the third content feature gradually increases.
Mutual information characterizes the interdependence between variables. The larger the mutual information between the first content feature and the third content feature, the more similar their distributions.
In one embodiment, the mutual information between the first content feature and the third content feature is characterized by their relative entropy. A discriminant loss function of the mutual information discriminator is constructed from the cross entropy of the first and third content features, and the cross entropy is positively correlated with the relative entropy. The positive and negative samples are input into the mutual information discriminator, iterative adversarial training is performed on the image generator and the mutual information discriminator in combination with the discriminant loss function, the discriminant loss function is iteratively optimized during adversarial training, and the mutual information is maximized.
It can be understood that the discriminant loss function of the mutual information discriminator is constructed from the cross entropy of the first and third content features and is used to optimize the parameters of the mutual information discriminator according to the error of the prediction result. The cross entropy is related to the relative entropy as follows:
H(Za; Zab) = H(Za) + I(DV)(Za; Zab)

where H(Za; Zab) is the cross entropy of the first content feature Za and the third content feature Zab, H(Za) is the information entropy of the first content feature, and I(DV)(Za; Zab) is the DV (Donsker-Varadhan) representation of the KL (Kullback-Leibler) divergence, i.e., the relative entropy of the first and third content features.
Since the information entropy of the first content feature is fixed, the cross entropy is positively correlated with the relative entropy. During adversarial training, the parameters of the mutual information discriminator are gradually optimized, and the cross entropy, and with it the relative entropy, becomes smaller and smaller. Relative entropy measures the similarity between two distributions: the smaller the relative entropy, the more similar the two distributions and the larger the mutual information between them. On this basis, the mutual information between the first and third content features can be characterized by their relative entropy, so that this mutual information grows larger and larger during adversarial training.
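The identity between cross entropy, information entropy, and relative entropy used above can be checked numerically (a minimal NumPy sketch; the two discrete distributions are assumed stand-ins for the feature distributions):

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])   # stand-in distribution of the first content feature
q = np.array([0.4, 0.4, 0.2])   # stand-in distribution of the third content feature

entropy = -np.sum(p * np.log(p))       # H(p): information entropy, fixed
kl      =  np.sum(p * np.log(p / q))   # relative entropy (KL divergence)
cross   = -np.sum(p * np.log(q))       # cross entropy

# cross entropy = information entropy + relative entropy
assert np.isclose(cross, entropy + kl)
```

Because `entropy` is a constant of `p`, minimizing the cross entropy is equivalent to minimizing the relative entropy, which is the positive correlation the text relies on.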
In one embodiment, a source domain encoder and a target domain encoder are obtained; the source domain image sample is encoded into a feature space by the source domain encoder to obtain its first content feature; the reference image sample is encoded into the same feature space by the target domain encoder to obtain its second content feature; and the target generated image is encoded into the feature space by the target domain encoder to obtain its third content feature.
The source domain encoder is used for extracting features from an image of a source domain, and embedding the extracted features into a feature space to obtain a feature vector. The target domain encoder is used for extracting features from the image of the target domain, and embedding the extracted features into the same feature space to obtain feature vectors. The feature space is used for storing feature vectors.
In a specific embodiment, the source domain encoder and the target domain encoder may be general-purpose encoders.
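The two encoders mapping into one shared feature space can be sketched as follows (a toy NumPy stand-in using random linear projections; the class name, 32×32 image size, and 64-dimensional feature space are assumptions, not details of this application):

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearEncoder:
    """Toy stand-in for a domain encoder: flattens an image and projects it
    into a shared feature space (here, 64 dimensions)."""
    def __init__(self, in_dim, feat_dim=64):
        self.w = rng.standard_normal((feat_dim, in_dim)) * 0.01

    def encode(self, image):
        return self.w @ image.reshape(-1)

enc_src = LinearEncoder(32 * 32)   # source domain encoder
enc_tgt = LinearEncoder(32 * 32)   # target domain encoder

i_a  = rng.standard_normal((32, 32))   # source domain image sample
i_ab = rng.standard_normal((32, 32))   # target generated image
z_a  = enc_src.encode(i_a)             # first content feature
z_ab = enc_tgt.encode(i_ab)            # third content feature
assert z_a.shape == z_ab.shape == (64,)   # both live in the same feature space
```

The key property the text requires is only that both encoders embed into the same feature space, so their outputs can be concatenated and compared by the mutual information discriminator.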
In one embodiment, the positive and negative samples are input into the mutual information discriminator, iterative adversarial training is performed on the image generator, the source domain encoder, the target domain encoder, and the mutual information discriminator, and the mutual information between the first and third content features is iteratively maximized during adversarial training until an iteration stop condition is reached.
Specifically, as shown in fig. 5, fig. 5 is a block diagram of a training system of an image generator in one embodiment, where Ia is the source domain image sample, Ib is the target generated image, Ic is the reference image sample, Za is the first content feature, Zb is the third content feature, and Zc is the second content feature. During adversarial training, the image generator, the source domain encoder, the target domain encoder, and the mutual information discriminator form a dynamic game process in which they oppose and promote one another.
Specifically, as training progresses: the difference in content features between the target generated image and the source domain image sample gradually decreases under the constraints of the source domain encoder, the target domain encoder, and the mutual information discriminator; the source domain encoder extracts the features of the source domain image sample more and more accurately under the constraints of the image generator, the target domain encoder, and the mutual information discriminator; the target domain encoder extracts the features of the target generated image and the reference image sample more and more accurately under the constraints of the image generator, the source domain encoder, and the mutual information discriminator; and the classification accuracy of the mutual information discriminator grows higher and higher under the constraints of the image generator, the source domain encoder, and the target domain encoder.
In one embodiment, a source domain decoder and a target domain decoder are obtained; the first content feature of the source domain image sample is mapped to the source domain by the source domain decoder to obtain a first reconstructed image, and to the target domain by the target domain decoder to obtain a second reconstructed image; a first loss function is constructed based on the difference between the source domain image sample and the first reconstructed image, and a second loss function based on the difference between the target generated image and the second reconstructed image. In this case, inputting the positive and negative samples into the mutual information discriminator and performing iterative adversarial training comprises: inputting the positive and negative samples into the mutual information discriminator and, in combination with the first and second loss functions, performing iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator, while iteratively maximizing the mutual information between the first and third content features until an iteration stop condition is reached.
It will be appreciated that when the source domain encoder encodes the first content feature into the feature space, it may also encode source domain information into that space; similarly, when the target domain encoder encodes the second and third content features into the feature space, it may also encode target domain information. Such source domain and target domain information can affect the training efficiency of the image generator to a certain extent. On this basis, the source domain information can be separated from the feature space by the source domain decoder, and the target domain information by the target domain decoder.
The source domain decoder reconstructs a feature vector obtained by the source domain encoder or the target domain encoder back to the source domain; the target domain decoder reconstructs such a feature vector back to the target domain. The first reconstructed image is the real image obtained by reconstructing the first content feature to the source domain through the source domain decoder; the second reconstructed image is the real image obtained by reconstructing the first content feature to the target domain through the target domain decoder. The first loss function reduces the L1 norm between the source domain image sample and the first reconstructed image; the second loss function reduces the L1 norm between the target generated image and the second reconstructed image.
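The L1-norm reconstruction losses can be sketched as follows (a minimal NumPy illustration with toy constant "images"; the variable names and values are assumptions for illustration):

```python
import numpy as np

def l1_loss(image, reconstruction):
    """Mean absolute error: the L1 norm that the reconstruction losses reduce."""
    return np.mean(np.abs(image - reconstruction))

i_a   = np.full((8, 8), 0.5)   # source domain image sample (toy)
i_aa  = i_a + 0.1              # first reconstructed image (toy, off by 0.1)
i_ab  = np.full((8, 8), 0.7)   # target generated image (toy)
i_ab2 = i_ab - 0.2             # second reconstructed image (toy, off by 0.2)

loss1 = l1_loss(i_a, i_aa)     # first loss function
loss2 = l1_loss(i_ab, i_ab2)   # second loss function
assert np.isclose(loss1, 0.1) and np.isclose(loss2, 0.2)
```

Minimizing both losses pushes the reconstructions toward the originals, which is how the decoders come to absorb the domain information as described below.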
Specifically, as shown in fig. 6, fig. 6 is a block diagram of a training system of an image generator in one embodiment, where Ia is the source domain image sample, Ib is the target generated image, Za is the first content feature, Zaa is the first reconstructed image, and Zab is the second reconstructed image. When no source domain information is present in the feature space, the source domain image sample can become infinitely close to the first reconstructed image, and the target generated image infinitely close to the second reconstructed image. Taking the target generated image and the second reconstructed image as an example: the second reconstructed image is likewise an image of the source domain image sample mapped to the target domain, differing from the target generated image only in how it is generated, and the target generated image carries no source domain information; hence, when no source domain information exists in the feature space, the second reconstructed image is infinitely close to the target generated image. Therefore, the first loss function is constructed based on the difference between the source domain image sample and the first reconstructed image and reduces that difference so that the source domain decoder learns the source domain information; the second loss function is constructed based on the difference between the target generated image and the second reconstructed image and reduces that difference so that the target domain decoder learns the target domain information.
In one embodiment, the third content feature of the target generated image may be mapped to the source domain by the source domain decoder to obtain a third reconstructed image, and to the target domain by the target domain decoder to obtain a fourth reconstructed image; a third loss function is constructed based on the difference between the source domain image sample and the third reconstructed image, and a fourth loss function based on the difference between the target generated image and the fourth reconstructed image; the positive and negative samples are then input into the mutual information discriminator and training proceeds in combination with the third and fourth loss functions.
The third reconstructed image is the real image obtained by reconstructing the third content feature to the source domain through the source domain decoder; the fourth reconstructed image is the real image obtained by reconstructing the third content feature to the target domain through the target domain decoder. The third loss function reduces the L1 norm between the source domain image sample and the third reconstructed image; the fourth loss function reduces the L1 norm between the target generated image and the fourth reconstructed image.
Specifically, as shown in fig. 7, fig. 7 is a block diagram of a training system of an image generator in another embodiment. Wherein Ia is a source domain image sample, Ib is a target generated image, Zb is a third content feature, Zba is a third reconstructed image, and Zbb is a fourth reconstructed image. When the target domain information does not exist in the feature space, the source domain image sample can be infinitely close to the third reconstructed image, and the target generation image is infinitely close to the fourth reconstructed image. Therefore, a third loss function is constructed based on the difference between the source domain image sample and the third reconstructed image, and the difference between the source domain image sample and the third reconstructed image is reduced through the third loss function, so that the source domain decoder learns the source domain information; and constructing a fourth loss function based on the difference between the target generation image and the fourth reconstruction image, and reducing the difference between the target generation image and the fourth reconstruction image through the fourth loss function so that the target domain decoder learns the target domain information.
In one embodiment, extracting the first content feature of the source domain image sample, the second content feature of the reference image sample, and the third content feature of the target generated image respectively comprises: obtaining a source domain encoder and a target domain encoder; encoding the source domain image sample into a feature space through the source domain encoder to obtain a first encoding feature of the source domain image sample, the first encoding feature comprising at least the first content feature; encoding the reference image sample into the feature space through the target domain encoder to obtain a second encoding feature comprising at least the second content feature; and encoding the target generated image into the feature space through the target domain encoder to obtain a third encoding feature comprising at least the third content feature.
The processing method of the image generator further comprises: obtaining a source domain decoder and a target domain decoder; mapping the first encoding feature of the source domain image sample to the source domain through the source domain decoder to obtain a first reconstructed image; mapping the first encoding feature to the target domain through the target domain decoder to obtain a second reconstructed image; and constructing a first loss function based on the difference between the source domain image sample and the first reconstructed image, and a second loss function based on the difference between the target generated image and the second reconstructed image.
Inputting the positive and negative samples into the mutual information discriminator and performing iterative adversarial training comprises: inputting the positive and negative samples into the mutual information discriminator and, in combination with the first and second loss functions, performing iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator, iteratively maximizing the mutual information between the first and third content features during adversarial training, until the first encoding feature comprises only the first content feature, the second encoding feature only the second content feature, the third encoding feature only the third content feature, and an iteration stop condition is reached.
It is understood that in the early stages of iterative adversarial training, the encoding features produced by an encoder include both content features and domain features. For example, the source domain encoder encodes the source domain image sample into a first encoding feature comprising the first content feature and a source domain feature; likewise, the target domain encoder encodes the target generated image into a third encoding feature comprising the third content feature and a target domain feature. By simultaneously minimizing the first and second loss functions during iterative adversarial training, the decoder learns to recover the domain features during decoding, so that the encoder learns to dissociate the content features, i.e., to remove the domain features, during encoding. In this way, in the later stages of training, the encoding features contain only content features and can better serve the mutual information discriminator in maximizing the mutual information between content features. It should be noted that "encoding features that include only content features" describes the target state; errors tolerable in practice are allowed. Moreover, since the mapping between the source domain and the target domain is symmetric, the mapping from the target domain to the source domain, the content feature dissociation during encoding, and the feature recovery during decoding proceed similarly to the foregoing embodiment.
In a specific embodiment, the source domain decoder and the target domain decoder may both adopt general-purpose decoders.
Specifically, during adversarial training, the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder, and the mutual information discriminator form a dynamic game process in which they oppose and promote one another. As training progresses, the difference in content features between the target generated image and the source domain image sample gradually decreases, the source domain encoder extracts the features of the source domain image sample more and more accurately, the target domain encoder extracts the features of the target generated image and the reference image sample more and more accurately, the source domain decoder gradually separates the source domain information from the feature space, the target domain decoder gradually separates the target domain information from the feature space, and the classification accuracy of the mutual information discriminator grows higher and higher.
In a specific embodiment, when the positive sample is generated from the first and third content features and the negative sample from the first and second content features, the second content feature may come from an image sample of the target domain. The first content feature comes from the source domain and the third from the target domain; if the second content feature also comes from the target domain, interference from domain information during training can be reduced. Likewise, when the negative sample is generated from the second and third content features, the second content feature may come from an image sample of the source domain.
In one embodiment, an image discriminator is obtained; the source domain image sample and the reference image sample are input into the image discriminator, the positive and negative samples are input into the mutual information discriminator, iterative dual adversarial training is performed on the image generator, the image discriminator, and the mutual information discriminator, and the mutual information is iteratively maximized during adversarial training until an iteration stop condition is reached.
The image discriminator is used for adversarial training with the image generator.
It can be understood that, because the quality of the images produced by the image generator otherwise lacks constraints, the image discriminator is introduced to improve the image quality of the target generated image based on the generative adversarial framework. As shown in fig. 8, fig. 8 is a block diagram of a training system of an image generator in still another embodiment. The image generator generates the target generated image from the source domain image sample; the source domain image sample and the target generated image are input into the image discriminator, which distinguishes between them, so that the image generator and the image discriminator are trained together in continuous opposition.
Specifically, as training progresses, the difference in content features between the source domain image sample and the target generated image gradually decreases under the constraints of the image discriminator and the mutual information discriminator, while the classification accuracy of the mutual information discriminator grows higher and higher under the constraints of the image generator and the image discriminator.
In a specific embodiment, the image discriminator may use the discriminator of a GAN (Generative Adversarial Network), or the like.
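The generative adversarial losses involved can be sketched as follows (plain NumPy on scalar discriminator scores; the score values and function name are assumptions for illustration, not outputs of any actual model):

```python
import numpy as np

def bce(prediction, label):
    """Binary cross entropy on a single discriminator score in (0, 1)."""
    eps = 1e-12
    return -(label * np.log(prediction + eps)
             + (1 - label) * np.log(1 - prediction + eps))

# The discriminator D should score real samples as 1 and generated images
# as 0; the generator G wants D's score on its output pushed toward 1.
d_real, d_fake = 0.9, 0.2                        # assumed discriminator outputs
d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)     # discriminator's loss
g_loss = bce(d_fake, 1.0)                        # generator's adversarial loss
assert d_loss < bce(0.5, 1.0) + bce(0.5, 0.0)    # better than a chance guesser
```

Minimizing `d_loss` sharpens the discriminator; minimizing `g_loss` pushes the generator to produce images the discriminator cannot reject, which is the continuous opposition described above.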
In one embodiment, the image generator is a target domain image generator and the image discriminator is a target domain image discriminator; a source domain image generator and a source domain image discriminator are obtained; a restored image of the target generated image in the source domain is generated by the source domain image generator; a cycle consistency loss is constructed for the source domain image generator, the source domain image discriminator, the target domain image generator, and the target domain image discriminator; the source domain image sample and the target generated image are input into the target domain image discriminator, the source domain image sample and the restored image into the source domain image discriminator, and the positive and negative samples into the mutual information discriminator; iterative dual adversarial training is then performed on the source domain image generator, the source domain image discriminator, the target domain image generator, the target domain image discriminator, and the mutual information discriminator in combination with the cycle consistency loss, and the mutual information is iteratively maximized during adversarial training until an iteration stop condition is reached.
The target domain image generator migrates an image from the source domain to the target domain, and the target domain image discriminator discriminates the images it generates; the source domain image generator migrates an image from the target domain to the source domain, and the source domain image discriminator discriminates the images it generates. The restored image is the real image generated in the source domain by the source domain image generator from the target generated image. The cycle consistency loss comprises the loss calculated by the target domain image discriminator and the loss calculated by the source domain image discriminator, and is used to optimize the parameters of the source domain image generator, the source domain image discriminator, the target domain image generator, and the target domain image discriminator.
It can be understood that, because the quality of the images produced by the target domain image generator otherwise lacks constraints, the source domain image generator, the target domain image discriminator, and the source domain image discriminator are introduced to improve the quality of the target generated image based on the generative adversarial framework. As shown in fig. 9, fig. 9 is a block diagram of a training system of an image generator in a further embodiment. The target domain image generator generates the target generated image from the source domain image sample; the source domain image sample and the target generated image are input into the target domain image discriminator, which distinguishes between them. The source domain image generator then generates the restored image of the target generated image in the source domain; the source domain image sample and the restored image are input into the source domain image discriminator, which distinguishes between them. In this way, the target domain image generator is trained in continuous opposition with the source domain image generator, the target domain image discriminator, and the source domain image discriminator.
Specifically, as training progresses, the difference in content features between the source domain image sample and the target generated image produced by the target domain image generator gradually decreases under the constraints of the source domain image generator, the target domain image discriminator, the source domain image discriminator, and the mutual information discriminator, while the classification accuracy of the mutual information discriminator grows higher and higher under the constraints of the target domain image generator, the source domain image generator, the target domain image discriminator, and the source domain image discriminator.
In a specific embodiment, the target domain image generator and the source domain image generator may adopt the generator of CycleGAN, or the like, and the target domain image discriminator and the source domain image discriminator may adopt the discriminator of CycleGAN, or the like.
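The cycle consistency idea, that an image migrated to the target domain and back should match the original, can be sketched as follows (a toy NumPy illustration in which the two generators are assumed to be exact inverses; all names are illustrative):

```python
import numpy as np

def cycle_consistency_loss(original, round_trip):
    """L1 distance between a source image and its restored image after the
    round trip source -> target -> source."""
    return np.mean(np.abs(original - round_trip))

# Toy stand-in generators: the target generator brightens, the source
# generator darkens by the same amount, so the round trip is lossless.
g_tgt = lambda img: img + 0.3   # source -> target (stand-in)
g_src = lambda img: img - 0.3   # target -> source (stand-in)

i_a = np.full((4, 4), 0.5)      # source domain image sample
restored = g_src(g_tgt(i_a))    # restored image back in the source domain
loss = cycle_consistency_loss(i_a, restored)
assert np.isclose(loss, 0.0)
```

In actual CycleGAN-style training the generators are learned networks, and this loss (together with the symmetric target-side cycle) is what the discriminator losses are combined with.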
The image generation method comprises: obtaining an image to be migrated; determining the source domain to which the image to be migrated belongs and the target domain to which it is to be migrated; querying an image generator that migrates images belonging to the source domain to the target domain; and generating, through the image generator, a migration image of the image to be migrated in the target domain, where the migration image and the image to be migrated have the same content features. The image generator is obtained through iterative adversarial training with a mutual information discriminator, during which a target parameter is iteratively maximized; the target parameter is the mutual information between the content features of a source domain image sample and the content features of a target generated image, where the source domain image sample belongs to the source domain, and the target generated image belongs to the target domain and is generated by the image generator from the source domain image sample. In this way, through the adversarial training between the mutual information discriminator and the image generation model, when the image generator migrates an image from the source domain to the target domain, the target-domain image remains consistent with the source-domain image in content features, so that deformation of the target-domain image is avoided.
In one embodiment, the method further comprises: acquiring a training sample set, where the training sample set comprises a first medical image obtained under a first imaging condition and a second medical image obtained under a second imaging condition, and medical images obtained under different imaging conditions belong to different image domains; taking the first medical image as the image to be migrated, where the source domain is the image domain to which the first medical image belongs and the target domain is the image domain to which the second medical image belongs; taking the second medical image and the migration image as an updated training sample set; and training a medical image processing model according to the updated training sample set.
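The updated-training-set step above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: images are treated as opaque objects, and the function name `build_updated_training_set` and the `generator` callable are assumptions introduced here for clarity.

```python
def build_updated_training_set(first_domain_images, second_domain_images, generator):
    # Migrate every image obtained under the first imaging condition into the
    # image domain of the second condition, then pool the migrated images with
    # the native second-condition images so all samples share one image domain.
    migrated = [generator(image) for image in first_domain_images]
    return second_domain_images + migrated
```

After this step, every sample the medical image processing model sees belongs to the second image domain, which is the condition the embodiment relies on to avoid the performance drop caused by mixed domains.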
The medical image processing model is a machine learning model for realizing a target function, which may be, for example, classifying the medical image or segmenting the medical image. The imaging conditions may be environmental conditions, such as ambient brightness and lighting, or parameters of the imaging device; for example, the same fundus camera operated under different imaging conditions produces fundus images that belong to different image domains.
It can be understood that when the medical image processing model is trained on a training sample set whose image samples belong to different image domains, the performance of the medical image processing model is degraded. By migrating the image samples of the training sample set to the same image domain through the image generator, all training samples of the medical image processing model lie in the same image domain, which improves the performance of the medical image processing model.
In this embodiment, a first medical image obtained under a first imaging condition and a second medical image obtained under a second imaging condition are acquired, where medical images obtained under different imaging conditions belong to different image domains; the first medical image is used as the image to be migrated, the source domain is the image domain to which the first medical image belongs, and the target domain is the image domain to which the second medical image belongs; the second medical image and the migration image are taken as an updated training sample set; and the medical image processing model is trained according to the updated training sample set, so that the performance of the medical image processing model is improved.
In one embodiment, training a medical image processing model based on an updated training sample set includes: determining a task type of the medical image processing model; determining a training label which corresponds to each training sample in the updated training sample set and is matched with the task type; and training the medical image processing model according to the training samples and the corresponding training labels of the training samples.
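The label-selection step above can be illustrated with a toy Python sketch. The `label_bank` mapping (sample id to per-task annotations) and the function name are assumptions introduced here; the patent does not prescribe a data layout.

```python
def labels_for_task(task_type, sample_ids, label_bank):
    # label_bank maps a sample id -> {task type: annotation}; keep only the
    # annotation matching the model's task type (e.g. a class label for
    # "classification", a pixel mask for "segmentation")
    return [(sid, label_bank[sid][task_type]) for sid in sample_ids]
```

The model is then trained on the resulting (sample, label) pairs, so the same updated training set can serve either task by swapping the task type.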
The task type may include a task of classifying the medical image, a task of segmenting the medical image, and the like.
In a specific embodiment, the classification task may be to classify the imaging region corresponding to the medical image. The segmentation task may be to segment the optic disc and the optic cup. The optic disc is the yellowish central portion of the retina where the retinal blood vessels enter and exit. The optic cup is a bright central depression of variable size on the optic disc.
In this embodiment, the task type of the medical image processing model is determined, the training labels corresponding to the training samples and matching with the task type in the updated training sample set are determined, and the medical image processing model is trained according to the training samples and the training labels corresponding to the training samples, so that the performance of the medical image processing model is improved.
It should be understood that, although the steps in the flowcharts of fig. 3, 10 and 11 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least a portion of the steps in fig. 3, 10 and 11 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and are not necessarily performed sequentially but may be performed in turns or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 12, there is provided a processing apparatus of an image generator, which may be implemented as a part of a computer device by using a software module, a hardware module, or a combination of the two. The apparatus specifically includes: an obtaining module 1202, a generating module 1204, an extracting module 1206, and an inputting module 1208, wherein:
an obtaining module 1202, configured to obtain an image sample set, an image generator, and a mutual information discriminator; the image sample set comprises source domain image samples and reference image samples;
a generating module 1204, configured to generate, by the image generator, a target generation image of the source domain image sample in the target domain;
an extracting module 1206, configured to extract a first content feature of the source domain image sample, a second content feature of the reference image sample, and a third content feature of the target generation image, respectively;
the generating module 1204 is further configured to generate a positive sample according to the first content feature and the third content feature, and generate a negative sample according to the second content feature and at least one of the first content feature and the third content feature;
an input module 1208, configured to input the positive sample and the negative sample into the mutual information discriminator, perform iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximize the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached.
In one embodiment, the input module 1208 is further configured to: characterize the mutual information of the first content feature and the third content feature by the relative entropy of the first content feature and the third content feature; construct a discrimination loss function of the mutual information discriminator according to the cross entropy of the first content feature and the third content feature, the cross entropy being positively correlated with the relative entropy; input the positive sample and the negative sample into the mutual information discriminator, perform iterative adversarial training on the image generator and the mutual information discriminator in combination with the discrimination loss function, iteratively optimize the discrimination loss function in the adversarial training process, and maximize the mutual information.
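A minimal sketch of such a discrimination loss is the binary cross-entropy of a discriminator over the positive and negative feature pairs: because the cross entropy is positively correlated with the relative entropy (KL divergence) that characterizes the mutual information, lowering this loss corresponds to tightening the discriminator's estimate of the mutual information. The linear scoring function below is a toy stand-in for the patent's mutual information discriminator network; all names are illustrative.

```python
import math

def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))

def linear_discriminator(pair, weights, bias):
    # score a spliced (first feature, third feature) vector; a real mutual
    # information discriminator would be a small neural network instead
    return sigmoid(sum(w * v for w, v in zip(weights, pair)) + bias)

def discrimination_loss(pos_pairs, neg_pairs, weights, bias):
    # binary cross-entropy: positive pairs are labelled 1, negative pairs 0
    loss = 0.0
    for pair in pos_pairs:
        loss -= math.log(linear_discriminator(pair, weights, bias))
    for pair in neg_pairs:
        loss -= math.log(1.0 - linear_discriminator(pair, weights, bias))
    return loss / (len(pos_pairs) + len(neg_pairs))
```

An uninformative discriminator (all-zero weights) outputs 0.5 everywhere and incurs a loss of ln 2; any discriminator that actually separates positive from negative pairs achieves a lower loss, which is what the adversarial training drives toward.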
In one embodiment, the extraction module 1206 is further configured to: acquire a source domain encoder and a target domain encoder; encode the source domain image sample into a feature space through the source domain encoder to obtain a first content feature of the source domain image sample; encode the reference image sample into the feature space through the target domain encoder to obtain a second content feature of the reference image sample; and encode the target generation image into the feature space through the target domain encoder to obtain a third content feature of the target generation image.
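The key point of the two encoders is that they project into the same feature space, so the first, second and third content features are directly comparable. As a toy illustration only (real encoders would be convolutional networks; the linear projection here is an assumption), an encoder can be sketched as:

```python
def make_linear_encoder(projection):
    # returns an encoder mapping a flattened image (list of floats) into the
    # shared feature space via a fixed linear projection; the source domain
    # encoder and target domain encoder would use different projections but
    # share the same output dimensionality
    def encode(image):
        return [sum(w * px for w, px in zip(row, image)) for row in projection]
    return encode
```

With matching output dimensions, features from the source domain encoder and the target domain encoder can be spliced into positive and negative samples for the mutual information discriminator.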
In one embodiment, the input module 1208 is further configured to: input the positive sample and the negative sample into the mutual information discriminator, perform iterative adversarial training on the image generator, the source domain encoder, the target domain encoder and the mutual information discriminator, and iteratively maximize the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached.
In one embodiment, the obtaining module 1202 is further configured to: acquire a source domain decoder and a target domain decoder. The processing apparatus of the image generator further comprises a mapping module, configured to: map the first content feature of the source domain image sample to the source domain through the source domain decoder to obtain a first reconstructed image; and map the first content feature of the source domain image sample to the target domain through the target domain decoder to obtain a second reconstructed image; a build module, configured to: construct a first loss function based on the difference between the source domain image sample and the first reconstructed image, and construct a second loss function based on the difference between the target generation image and the second reconstructed image; the input module 1208 is further configured to: input the positive sample and the negative sample into the mutual information discriminator, perform iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder and the mutual information discriminator in combination with the first loss function and the second loss function, and iteratively maximize the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached.
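The two reconstruction losses can be sketched as mean-absolute (L1) differences, a common choice for such losses; the patent does not fix the distance metric, so the L1 form and the function names here are assumptions for illustration.

```python
def l1_distance(image_a, image_b):
    # mean absolute pixel difference between two equally sized flattened images
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)

def reconstruction_losses(source_sample, first_reconstruction,
                          target_generated, second_reconstruction):
    # first loss: the source domain sample vs. its decoding back into the
    # source domain; second loss: the target generation image vs. the decoding
    # of the same content feature into the target domain
    return (l1_distance(source_sample, first_reconstruction),
            l1_distance(target_generated, second_reconstruction))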
In one embodiment, the obtaining module 1202 is further configured to: acquire an image discriminator; the input module 1208 is further configured to: input the source domain image sample and the reference image sample into the image discriminator, input the positive sample and the negative sample into the mutual information discriminator, perform iterative dual adversarial training on the image generator, the image discriminator and the mutual information discriminator, and iteratively maximize the mutual information in the adversarial training process until an iteration stop condition is reached.
In one embodiment, the image generator is a target domain image generator and the image discriminator is a target domain image discriminator; the obtaining module 1202 is further configured to: acquire a source domain image generator and a source domain image discriminator; the generating module 1204 is further configured to: generate, by the source domain image generator, a restored image of the target generation image in the source domain; the build module is further configured to: construct a cycle consistency loss of the source domain image generator, the source domain image discriminator, the target domain image generator and the target domain image discriminator; the input module 1208 is further configured to: input the source domain image sample and the target generation image into the target domain image discriminator, input the source domain image sample and the restored image into the source domain image discriminator, input the positive sample and the negative sample into the mutual information discriminator, perform iterative dual adversarial training on the source domain image generator, the source domain image discriminator, the target domain image generator, the target domain image discriminator and the mutual information discriminator in combination with the cycle consistency loss, and iteratively maximize the mutual information in the adversarial training process until an iteration stop condition is reached.
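The forward half of the cycle consistency loss can be sketched as follows: a source image is mapped to the target domain and back, and the restoration is compared with the original. This is a minimal illustration assuming flattened images and an L1 penalty, as in CycleGAN; the patent itself does not fix these details.

```python
def cycle_consistency_loss(source_batch, target_generator, source_generator):
    # forward cycle: source image -> target domain -> restored source image;
    # penalise the mean absolute difference between each image and its restoration
    total = 0.0
    for image in source_batch:
        restored = source_generator(target_generator(image))
        total += sum(abs(r - x) for r, x in zip(restored, image)) / len(image)
    return total / len(source_batch)
```

A symmetric backward cycle (target image to source domain and back) would normally be added; both terms are zero exactly when the two generators invert each other.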
In one embodiment, the generating module 1204 is further configured to: splicing the first content characteristic and the third content characteristic to obtain a positive sample; and splicing at least one of the first content characteristic and the third content characteristic with the second content characteristic to obtain a negative sample.
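The splicing operation is simple feature concatenation; a minimal Python sketch (function names are illustrative, not from the patent):

```python
def make_positive_sample(first_feature, third_feature):
    # positive sample: the source sample's content feature spliced with the
    # target generation image's content feature
    return list(first_feature) + list(third_feature)

def make_negative_samples(first_feature, third_feature, second_feature):
    # negative samples: the reference sample's content feature spliced with
    # either of the other two features; here one negative is built per pairing
    return [list(first_feature) + list(second_feature),
            list(third_feature) + list(second_feature)]
```

The mutual information discriminator then only has to tell spliced matching pairs from spliced mismatched pairs, which is what makes its cross-entropy loss a handle on the mutual information.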
For specific limitations of the processing apparatus of the image generator, reference may be made to the above limitations of the processing method of the image generator, which are not repeated here. Each module in the processing apparatus of the image generator may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and perform the operations corresponding to the modules.
The processing apparatus of the image generator acquires a source domain image sample, a reference image sample, an image generator and a mutual information discriminator; generates, by the image generator, a target generation image of the source domain image sample in the target domain; respectively extracts a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image; generates a positive sample according to the first content feature and the third content feature, and generates a negative sample according to the second content feature and at least one of the first content feature and the third content feature; inputs the positive sample and the negative sample into the mutual information discriminator; and performs iterative adversarial training on the image generator and the mutual information discriminator, iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached. In this way, through the adversarial training between the mutual information discriminator and the image generator, when the image generator migrates an image from the source domain to the target domain, the target domain image remains consistent with the source domain image in content features, so that deformation of the target domain image is avoided; moreover, when the target generation image is used to train a medical image processing model, the performance of the medical image processing model can be improved.
In one embodiment, as shown in fig. 13, there is provided an image generation apparatus, which may be implemented as a part of a computer device by using a software module, a hardware module, or a combination of the two. The apparatus specifically includes: an obtaining module 1302, a determining module 1304, a querying module 1306, and a generating module 1308, wherein:
an obtaining module 1302, configured to obtain an image to be migrated;
a determining module 1304, configured to determine a source domain to which an image to be migrated belongs and a target domain to which the image to be migrated belongs;
a query module 1306 for querying an image generator for migrating images belonging to a source domain to a target domain;
a generating module 1308, configured to generate, by the image generator, a migration image of the image to be migrated in the target domain; the content characteristics of the transferred image and the image to be transferred are the same;
the image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized in the adversarial training process; the target parameter is the mutual information between the content features of a source domain image sample and the content features of a target generation image; the source domain image sample belongs to the source domain; and the target generation image belongs to the target domain and is generated from the source domain image sample by the image generator.
In one embodiment, the obtaining module 1302 is further configured to: acquiring a training sample set; the training sample set comprises a first medical image obtained according to a first imaging condition and a second medical image obtained according to a second imaging condition; the image domains of the medical images obtained under different imaging conditions are different; taking the first medical image as an image to be migrated; the source domain is an image domain to which the first medical image belongs; the target domain is an image domain to which the second medical image belongs; an obtaining module 1302, configured to: acquiring a second medical image and a migration image as an updated training sample set; the image generation apparatus further comprises a training module configured to: and training the medical image processing model according to the updated training sample set.
In one embodiment, the training module is further configured to: determining a task type of the medical image processing model; determining a training label which corresponds to each training sample in the updated training sample set and is matched with the task type; and training the medical image processing model according to the training samples and the corresponding training labels of the training samples.
For specific limitations of the image generation apparatus, reference may be made to the above limitations of the image generation method, which are not repeated here. Each module in the image generation apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and perform the operations corresponding to the modules.
The image generation apparatus acquires an image to be migrated; determines the source domain to which the image to be migrated belongs and the target domain to which it is to be migrated; queries an image generator for migrating images belonging to the source domain to the target domain; and generates, by the image generator, a migration image of the image to be migrated in the target domain, the migration image having the same content features as the image to be migrated. The image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training; the target parameter is the mutual information between the content features of a source domain image sample and the content features of a target generation image, where the source domain image sample belongs to the source domain, and the target generation image belongs to the target domain and is generated by the image generator from the source domain image sample. In this way, through the adversarial training between the mutual information discriminator and the image generator, when the image generator migrates an image from the source domain to the target domain, the target domain image remains consistent with the source domain image in content features, so that deformation of the target domain image is avoided; moreover, when the target generation image is used to train a medical image processing model, the performance of the medical image processing model can be improved.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 14. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing training data and/or image generation data of the image generator. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a processing method of an image generator and/or an image generation method.
Those skilled in the art will appreciate that the architecture shown in fig. 14 is merely a block diagram of part of the structure related to the disclosed solution and does not limit the computer devices to which the disclosed solution applies; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (24)

1. A processing method of an image generator, the method comprising:
acquiring an image sample set, an image generator and a mutual information discriminator; the set of image samples comprises source domain image samples and reference image samples; the reference image sample is any image sample in the source domain other than the source domain image sample, or an image sample of the target domain;
generating, by the image generator, a target generation image of the source domain image sample in a target domain;
respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generation image;
generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature;
inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached.
2. The method of claim 1, wherein the inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached comprises:
characterizing the mutual information of the first content feature and the third content feature by the relative entropy of the first content feature and the third content feature;
constructing a discrimination loss function of the mutual information discriminator according to the cross entropy of the first content feature and the third content feature; the cross entropy is positively correlated with the relative entropy;
inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator in combination with the discrimination loss function, iteratively optimizing the discrimination loss function in the adversarial training process, and maximizing the mutual information.
3. The method of claim 1, wherein the separately extracting a first content feature of the source domain image sample, a second content feature of the reference image sample, and a third content feature of the target generation image comprises:
acquiring a source domain encoder and a target domain encoder;
encoding the source domain image sample into a feature space through the source domain encoder to obtain a first content feature of the source domain image sample;
and encoding the reference image sample into the feature space through the target domain encoder to obtain a second content feature of the reference image sample, and encoding the target generated image into the feature space through the target domain encoder to obtain a third content feature of the target generated image.
4. The method of claim 3, wherein the inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached comprises:
inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator, the source domain encoder, the target domain encoder and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached.
5. The method of claim 3, further comprising:
acquiring a source domain decoder and a target domain decoder;
mapping the first content characteristics of the source domain image samples to a source domain through the source domain decoder to obtain a first reconstructed image; mapping the first content characteristics of the source domain image sample to the target domain through the target domain decoder to obtain a second reconstructed image;
constructing a first loss function based on differences between the source domain image samples and the first reconstructed image, and constructing a second loss function based on differences between the target generated image and the second reconstructed image;
the inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached comprises:
inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder and the mutual information discriminator in combination with the first loss function and the second loss function, and iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached.
6. The method of claim 1, further comprising:
acquiring an image discriminator;
the inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature in the adversarial training process until an iteration stop condition is reached comprises:
inputting the source domain image sample and the reference image sample into the image discriminator, inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative dual adversarial training on the image generator, the image discriminator and the mutual information discriminator, and iteratively maximizing the mutual information in the adversarial training process until an iteration stop condition is reached.
7. The method of claim 6, wherein the image generator is a target domain image generator; the image discriminator is a target domain image discriminator; the method further comprises the following steps:
acquiring a source domain image generator and a source domain image discriminator;
generating a restored image of the target generation image in a source domain by the source domain image generator;
constructing a cycle consistency loss of the source domain image generator, the source domain image discriminator, the target domain image generator and the target domain image discriminator;
the inputting the source domain image sample and the reference image sample into the image discriminator and the positive sample and the negative sample into the mutual information discriminator, performing iterative dual adversarial training on the image generator, the image discriminator and the mutual information discriminator, and iteratively maximizing the mutual information in the adversarial training process until an iteration stop condition is reached comprises:
inputting the source domain image sample and the target generation image into the target domain image discriminator, inputting the source domain image sample and the restored image into the source domain image discriminator, inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative dual adversarial training on the source domain image generator, the source domain image discriminator, the target domain image generator, the target domain image discriminator and the mutual information discriminator in combination with the cycle consistency loss, and iteratively maximizing the mutual information in the adversarial training process until an iteration stop condition is reached.
8. The method of claim 1, wherein generating positive samples according to the first content feature and the third content feature and generating the negative samples according to the second content feature and at least one of the first content feature and the third content feature comprises:
splicing the first content feature and the third content feature to obtain the positive sample;
and splicing at least one of the first content feature and the third content feature with the second content feature to obtain the negative sample.
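The splicing in claim 8 is a plain concatenation of content-feature vectors; the vector shapes and the choice of pairing the first and second features for the negative sample are illustrative assumptions:

```python
import numpy as np

def make_pairs(first_feat, second_feat, third_feat):
    """Build mutual-information discriminator inputs by splicing features.

    Positive: source features with generated-image features (same content).
    Negative: source features with reference features (mismatched content).
    """
    positive = np.concatenate([first_feat, third_feat])
    negative = np.concatenate([first_feat, second_feat])
    return positive, negative
```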
9. An image generation method, characterized in that the method comprises:
acquiring an image to be migrated;
determining a source domain to which the image to be migrated belongs and a target domain to which the image to be migrated belongs;
querying an image generator for migrating images belonging to the source domain to the target domain;
generating a migration image of the image to be migrated in the target domain through the image generator; the migration image and the image to be migrated have the same content features;
the image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training process; the target parameter is the mutual information between the content features of a source domain image sample and the content features of a generated image sample; the source domain image sample belongs to the source domain; the generated image sample belongs to the target domain and is generated by the image generator from the source domain image sample;
wherein the step of iteratively adversarially training the image generator and the mutual information discriminator comprises: extracting a first content feature of the source domain image sample, a second content feature of a reference image sample and a third content feature of the generated image sample; the reference image sample is any image sample in the source domain other than the source domain image sample, or an image sample of the target domain; generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature; and inputting the positive sample and the negative sample into the mutual information discriminator, and performing iterative adversarial training on the image generator and the mutual information discriminator.
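The adversarial step above can be sketched with a logistic cross-entropy score over positive and negative pairs, in the spirit of Deep InfoMax-style mutual-information estimation; the scalar scores and function names are illustrative simplifications, not the patent's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mi_discriminator_loss(score_pos, score_neg):
    """Binary cross-entropy over one positive and one negative pair.

    The mutual information discriminator is trained to minimize this loss
    (tell the pairs apart); the generator is trained against it, which
    pushes up the implied mutual-information estimate between the source
    and generated content features."""
    return float(-(np.log(sigmoid(score_pos)) + np.log(1.0 - sigmoid(score_neg))))
```

An undecided discriminator (both scores 0) yields 2·ln 2; confident correct scores drive the loss toward zero.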
10. The method of claim 9, further comprising:
acquiring a training sample set; the training sample set comprises a first medical image obtained according to a first imaging condition and a second medical image obtained according to a second imaging condition; the image domains of the medical images obtained under different imaging conditions are different;
taking the first medical image as an image to be migrated; the source domain is an image domain to which the first medical image belongs; the target domain is an image domain to which the second medical image belongs;
acquiring the second medical image and the migration image as an updated training sample set; the migration image is an image obtained by migrating the first medical image to the target domain;
and training the medical image processing model according to the updated training sample set.
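The update in claim 10 amounts to migrating the first-condition images into the target domain and pooling them with the second-condition images; a minimal sketch, where `migrate` stands in for the trained image generator and plain lists stand in for image sets:

```python
def build_updated_training_set(first_images, second_images, migrate):
    """Pool second-imaging-condition images with migrated first-condition
    images so every sample lies in the same (target) image domain."""
    migrated = [migrate(img) for img in first_images]
    return second_images + migrated
```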
11. The method of claim 10, wherein training a medical image processing model based on the updated training sample set comprises:
determining a task type of the medical image processing model;
determining a training label corresponding to each training sample in the updated training sample set and matched with the task type;
and training the medical image processing model according to the training samples and the corresponding training labels of the training samples.
12. A processing apparatus of an image generator, characterized in that the apparatus comprises:
the acquisition module is used for acquiring an image sample set, an image generator and a mutual information discriminator; the image sample set comprises a source domain image sample and a reference image sample; the reference image sample is any image sample in the source domain other than the source domain image sample, or an image sample of the target domain;
a generating module, configured to generate, by the image generator, a target generated image of the source domain image sample in a target domain;
the extraction module is used for respectively extracting a first content feature of the source domain image sample, a second content feature of the reference image sample and a third content feature of the target generated image;
the generating module is further configured to generate a positive sample according to the first content feature and the third content feature, and generate a negative sample according to the second content feature and at least one of the first content feature and the third content feature;
and the input module is used for inputting the positive sample and the negative sample into the mutual information discriminator, performing iterative adversarial training on the image generator and the mutual information discriminator, and iteratively maximizing the mutual information of the first content feature and the third content feature during the adversarial training process until an iteration stop condition is reached.
13. The apparatus of claim 12, wherein the input module is further configured to characterize the mutual information of the first content feature and the third content feature by the relative entropy of the first content feature and the third content feature; construct a discrimination loss function of the mutual information discriminator according to the cross entropy of the first content feature and the third content feature, the cross entropy being positively correlated with the relative entropy; and input the positive sample and the negative sample into the mutual information discriminator, perform iterative adversarial training on the image generator and the mutual information discriminator in combination with the discrimination loss function, iteratively optimize the discrimination loss function during the adversarial training process, and maximize the mutual information.
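Claim 13 characterizes mutual information as the relative entropy (KL divergence) between the joint distribution of two feature variables and the product of their marginals. A minimal discrete-case numpy illustration of that identity (the trainable cross-entropy loss in the claim is a surrogate for this quantity, not computed this way in practice):

```python
import numpy as np

def mutual_information(p_joint):
    """I(X;Y) = KL( p(x, y) || p(x) p(y) ) for a discrete joint table."""
    px = p_joint.sum(axis=1, keepdims=True)   # marginal of X (rows)
    py = p_joint.sum(axis=0, keepdims=True)   # marginal of Y (columns)
    prod = px * py                            # product of marginals
    mask = p_joint > 0                        # skip zero-probability cells
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / prod[mask])))
```

Independent variables give zero mutual information; a perfectly correlated pair of binary variables gives ln 2.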
14. The apparatus of claim 12, wherein the extraction module is further configured to acquire a source domain encoder and a target domain encoder; encode the source domain image sample into a feature space through the source domain encoder to obtain the first content feature of the source domain image sample; encode the reference image sample into the feature space through the target domain encoder to obtain the second content feature of the reference image sample; and encode the target generated image into the feature space through the target domain encoder to obtain the third content feature of the target generated image.
15. The apparatus of claim 14, wherein the input module is further configured to input the positive sample and the negative sample into the mutual information discriminator, perform iterative adversarial training on the image generator, the source domain encoder, the target domain encoder and the mutual information discriminator, and iteratively maximize the mutual information of the first content feature and the third content feature during the adversarial training process until an iteration stop condition is reached.
16. The apparatus of claim 14, further comprising a mapping module and a building module;
the acquisition module is further used for acquiring a source domain decoder and a target domain decoder;
the mapping module is configured to map, by the source domain decoder, the first content feature of the source domain image sample to a source domain to obtain a first reconstructed image; mapping the first content characteristics of the source domain image sample to the target domain through the target domain decoder to obtain a second reconstructed image;
the construction module is configured to construct a first loss function based on a difference between the source domain image sample and the first reconstructed image, and to construct a second loss function based on a difference between the target generated image and the second reconstructed image;
the input module is further configured to input the positive sample and the negative sample into the mutual information discriminator, perform iterative adversarial training on the image generator, the source domain encoder, the target domain encoder, the source domain decoder, the target domain decoder and the mutual information discriminator in combination with the first loss function and the second loss function, and iteratively maximize the mutual information of the first content feature and the third content feature during the adversarial training process until an iteration stop condition is reached.
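The first and second loss functions of claim 16 can be sketched as L1 reconstruction differences; `encode_s`, `decode_s` and `decode_t` are illustrative stand-ins for the source domain encoder and the source/target domain decoders:

```python
import numpy as np

def reconstruction_losses(x_source, target_generated, encode_s, decode_s, decode_t):
    """First loss: source sample vs. its source-domain reconstruction.
    Second loss: target generated image vs. the target-domain mapping of
    the same (first) content feature."""
    content = encode_s(x_source)                # first content feature
    first_recon = decode_s(content)             # first reconstructed image
    second_recon = decode_t(content)            # second reconstructed image
    loss1 = float(np.mean(np.abs(x_source - first_recon)))
    loss2 = float(np.mean(np.abs(target_generated - second_recon)))
    return loss1, loss2
```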
17. The apparatus of claim 12,
the acquisition module is also used for acquiring an image discriminator;
the input module is further configured to input the source domain image sample and the reference image sample into the image discriminator, input the positive sample and the negative sample into the mutual information discriminator, perform iterative dual adversarial training on the image generator, the image discriminator and the mutual information discriminator, and iteratively maximize the mutual information during the adversarial training process until an iteration stop condition is reached.
18. The apparatus of claim 17, wherein the image generator is a target domain image generator; the image discriminator is a target domain image discriminator;
the acquisition module is also used for acquiring a source domain image generator and a source domain image discriminator;
the generating module is further used for generating, through the source domain image generator, a restored image of the target generated image in the source domain;
a construction module, further configured to construct a cycle consistency loss of the source domain image generator, the source domain image discriminator, the target domain image generator, and the target domain image discriminator;
the input module is further configured to input the source domain image sample and the target generated image into the target domain image discriminator, input the source domain image sample and the restored image into the source domain image discriminator, and input the positive sample and the negative sample into the mutual information discriminator, perform iterative dual adversarial training on the source domain image generator, the source domain image discriminator, the target domain image generator, the target domain image discriminator and the mutual information discriminator in combination with the cycle consistency loss, and iteratively maximize the mutual information during the adversarial training process until an iteration stop condition is reached.
19. The apparatus of claim 12, wherein the generating module is further configured to concatenate the first content feature and the third content feature to obtain the positive sample; and splicing at least one of the first content characteristic and the third content characteristic with the second content characteristic to obtain the negative sample.
20. An image generation apparatus, characterized in that the apparatus comprises:
The acquisition module is used for acquiring an image to be migrated;
the determining module is used for determining a source domain to which the image to be migrated belongs and a target domain to which the image to be migrated belongs;
a query module for querying an image generator for migrating an image belonging to the source domain to the target domain;
the generation module is used for generating a migration image of the image to be migrated in the target domain through the image generator; the migration image and the image to be migrated have the same content features;
the image generator is obtained through iterative adversarial training with a mutual information discriminator, and a target parameter is iteratively maximized during the adversarial training process; the target parameter is the mutual information between the content features of a source domain image sample and the content features of a generated image sample; the source domain image sample belongs to the source domain; the generated image sample belongs to the target domain and is generated by the image generator from the source domain image sample;
wherein the step of iteratively adversarially training the image generator and the mutual information discriminator comprises: extracting a first content feature of the source domain image sample, a second content feature of a reference image sample and a third content feature of the generated image sample; the reference image sample is any image sample in the source domain other than the source domain image sample, or an image sample of the target domain; generating a positive sample according to the first content feature and the third content feature, and generating a negative sample according to the second content feature and at least one of the first content feature and the third content feature; and inputting the positive sample and the negative sample into the mutual information discriminator, and performing iterative adversarial training on the image generator and the mutual information discriminator.
21. The apparatus of claim 20,
the acquisition module is further used for acquiring a training sample set; the training sample set comprises a first medical image obtained according to a first imaging condition and a second medical image obtained according to a second imaging condition; the image domains of the medical images obtained under different imaging conditions are different; taking the first medical image as an image to be migrated; the source domain is an image domain to which the first medical image belongs; the target domain is an image domain to which the second medical image belongs;
the acquisition module is further configured to acquire the second medical image and the migration image as an updated training sample set; the migration image is an image obtained by migrating the first medical image to the target domain;
the apparatus also includes a training module for training a medical image processing model according to the updated training sample set.
22. The apparatus of claim 21, wherein the training module is further configured to determine a task type of the medical image processing model; determining a training label corresponding to each training sample in the updated training sample set and matched with the task type; and training the medical image processing model according to the training samples and the corresponding training labels of the training samples.
23. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 11 when executing the computer program.
24. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 11.
CN202010392503.3A 2020-05-11 2020-05-11 Processing method of image generator, image generation method and device Active CN111597946B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010392503.3A CN111597946B (en) 2020-05-11 2020-05-11 Processing method of image generator, image generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010392503.3A CN111597946B (en) 2020-05-11 2020-05-11 Processing method of image generator, image generation method and device

Publications (2)

Publication Number Publication Date
CN111597946A CN111597946A (en) 2020-08-28
CN111597946B true CN111597946B (en) 2022-04-08

Family

ID=72183601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010392503.3A Active CN111597946B (en) 2020-05-11 2020-05-11 Processing method of image generator, image generation method and device

Country Status (1)

Country Link
CN (1) CN111597946B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112686205B (en) * 2021-01-14 2023-10-13 电子科技大学中山学院 Parameter updating method and device and multi-terminal network architecture
CN112633425B (en) * 2021-03-11 2021-05-11 腾讯科技(深圳)有限公司 Image classification method and device
CN113080990B (en) * 2021-03-25 2023-01-10 南京蝶谷健康科技有限公司 Heart beat anomaly detection method based on CycleGAN and BilSTM neural network method
CN113393386B (en) * 2021-05-18 2022-03-01 电子科技大学 Non-paired image contrast defogging method based on feature decoupling
CN113435365B (en) * 2021-06-30 2022-08-16 平安科技(深圳)有限公司 Face image migration method and device
CN114255502B (en) * 2021-12-23 2024-03-29 中国电信股份有限公司 Face image generation method and device, face recognition method, equipment and medium
CN114882220B (en) * 2022-05-20 2023-02-28 山东力聚机器人科技股份有限公司 Domain-adaptive priori knowledge-based GAN (generic object model) image generation method and system
CN115311526B (en) * 2022-10-11 2023-04-07 江苏智云天工科技有限公司 Defect sample generation method and system based on improved Cycle GAN network

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7653264B2 (en) * 2005-03-04 2010-01-26 The Regents Of The University Of Michigan Method of determining alignment of images in high dimensional feature space
EP3182413B1 (en) * 2015-12-16 2018-08-29 Ruhr-Universität Bochum Adaptive line enhancer based method
KR102403494B1 (en) * 2017-04-27 2022-05-27 에스케이텔레콤 주식회사 Method for learning Cross-domain Relations based on Generative Adversarial Network
CN107609116B (en) * 2017-09-13 2020-09-18 星环信息科技(上海)有限公司 Method and equipment for creating cross-domain migration deep network
CN108171320B (en) * 2017-12-06 2021-10-19 西安工业大学 Image domain conversion network and conversion method based on generative countermeasure network
CN108665058B (en) * 2018-04-11 2021-01-05 徐州工程学院 Method for generating countermeasure network based on segment loss
CN108682022B (en) * 2018-04-25 2020-11-24 清华大学 Visual tracking method and system based on anti-migration network
CN109214421B (en) * 2018-07-27 2022-01-28 创新先进技术有限公司 Model training method and device and computer equipment
CN109447895B (en) * 2018-09-03 2021-06-08 腾讯科技(武汉)有限公司 Picture generation method and device, storage medium and electronic device
CN109753992B (en) * 2018-12-10 2020-09-01 南京师范大学 Unsupervised domain adaptive image classification method based on condition generation countermeasure network
CN109740682B (en) * 2019-01-08 2020-07-28 南京大学 Image identification method based on domain transformation and generation model
CN109745062B (en) * 2019-01-30 2020-01-10 腾讯科技(深圳)有限公司 CT image generation method, device, equipment and storage medium
CN110322446B (en) * 2019-07-01 2021-02-19 华中科技大学 Domain self-adaptive semantic segmentation method based on similarity space alignment
CN110414462B (en) * 2019-08-02 2022-02-08 中科人工智能创新技术研究院(青岛)有限公司 Unsupervised cross-domain pedestrian re-identification method and system
CN110598765B (en) * 2019-08-28 2023-05-26 腾讯科技(深圳)有限公司 Sample generation method, device, computer equipment and storage medium
CN110728295B (en) * 2019-09-02 2022-05-24 深圳中科保泰空天技术有限公司 Semi-supervised landform classification model training and landform graph construction method

Also Published As

Publication number Publication date
CN111597946A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN111597946B (en) Processing method of image generator, image generation method and device
CN111199550B (en) Training method, segmentation method, device and storage medium of image segmentation network
CN113077471B (en) Medical image segmentation method based on U-shaped network
US10929708B2 (en) Deep learning network for salient region identification in images
CN107563434B (en) Brain MRI image classification method and device based on three-dimensional convolutional neural network
CN110689025A (en) Image recognition method, device and system, and endoscope image recognition method and device
CN111932529B (en) Image classification and segmentation method, device and system
CN112488976B (en) Multi-modal medical image fusion method based on DARTS network
CN111242948B (en) Image processing method, image processing device, model training method, model training device, image processing equipment and storage medium
CN113822289A (en) Training method, device and equipment of image noise reduction model and storage medium
CN115880720A (en) Non-labeling scene self-adaptive human body posture and shape estimation method based on confidence degree sharing
Wang et al. Left ventricle landmark localization and identification in cardiac MRI by deep metric learning-assisted CNN regression
CN113592769B (en) Abnormal image detection and model training method, device, equipment and medium
Chatterjee et al. A survey on techniques used in medical imaging processing
CN112233017B (en) Method for enhancing pathological face data based on generation countermeasure network
Mahapatra Registration of histopathogy images using structural information from fine grained feature maps
CN113822323A (en) Brain scanning image identification processing method, device, equipment and storage medium
CN113724185A (en) Model processing method and device for image classification and storage medium
Aguirre Nilsson et al. Classification of ulcer images using convolutional neural networks
Tawfeeq et al. Predication of Most Significant Features in Medical Image by Utilized CNN and Heatmap.
Kobayashi et al. Learning global and local features of normal brain anatomy for unsupervised abnormality detection
CN116109655B (en) Image encoder processing method and device and image segmentation method
Li et al. An optimization r-cnn method for Ovarian cyst detection
CN117649422B (en) Training method of multi-modal image segmentation model and multi-modal image segmentation method
CN113538451B (en) Method and device for segmenting magnetic resonance image of deep vein thrombosis, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40027932

Country of ref document: HK

TA01 Transfer of patent application right

Effective date of registration: 20210927

Address after: 518000 Room 201, building A, 1 front Bay Road, Shenzhen Qianhai cooperation zone, Shenzhen, Guangdong

Applicant after: Tencent Medical Health (Shenzhen) Co.,Ltd.

Address before: 518000 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 Floors

Applicant before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

GR01 Patent grant