CN111402179B - Image synthesis method and system combining an adversarial autoencoder and a generative adversarial network - Google Patents

Image synthesis method and system combining an adversarial autoencoder and a generative adversarial network

Info

Publication number
CN111402179B
CN111402179B (application CN202010169306.5A)
Authority
CN
China
Prior art keywords
image
discriminator
loss function
adversarial
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202010169306.5A
Other languages
Chinese (zh)
Other versions
CN111402179A (en)
Inventor
张桂梅
胡强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Hangkong University
Original Assignee
Nanchang Hangkong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Hangkong University filed Critical Nanchang Hangkong University
Priority to CN202010169306.5A priority Critical patent/CN111402179B/en
Publication of CN111402179A publication Critical patent/CN111402179A/en
Application granted granted Critical
Publication of CN111402179B publication Critical patent/CN111402179B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30101Blood vessel; Artery; Vein; Vascular

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The invention discloses an image synthesis method and system combining an adversarial autoencoder and a generative adversarial network. The method comprises: constructing an enhanced adversarial autoencoder comprising two groups of encoders of different classes, two groups of first discriminators of different classes, and a group of decoders; constructing an improved conditional generative adversarial network comprising a generator and a second discriminator; taking manually segmented vessel-tree images and the corresponding original fundus retina images as training data, and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator; and synthesizing a fundus retina image from a manually segmented vessel-tree image to be processed based on the two optimal generators, obtaining a synthesized image. The invention can generate sample data with higher precision and more diverse styles, effectively augmenting a limited set of training samples.

Description

Image synthesis method and system combining an adversarial autoencoder and a generative adversarial network
Technical Field
The present invention relates to the field of image processing, and more particularly to an image synthesis method and system combining an adversarial autoencoder and a generative adversarial network.
Background
In medical applications, a high-performing medical image processing algorithm requires a large amount of effective medical image data with specific annotation information as training samples. In clinical practice, however, medical image data are difficult to acquire directly: imaging can expose patients to harmful radiation and the data involve patient privacy, so image samples of many human organs are very scarce. At the same time, lesion complexity varies between organs, making manual annotation of effective feature labels both difficult and expensive. Therefore, to advance medical techniques such as medical image segmentation, recognition, and registration, the synthesis of augmented medical image sample datasets has attracted wide attention.
Surveying current research on medical image data augmentation at home and abroad, the related methods fall into two categories: traditional augmentation methods and deep-learning-based augmentation methods. Traditional augmentation comprises rigid transformations, such as translation, flipping, rotation, scaling, and affine transformation, and non-rigid transformations, such as elastic deformation. A rigid transformation maps the image through a set of fixed transformation matrices, so the resulting augmented dataset contains only a few preset correspondences. A non-rigid transformation instead constrains individual pixels within a given range, such as a local rotation angle or translation distance; although this can both enlarge the dataset and introduce diverse variation, the transformed images lack a specific evaluation standard, it is difficult to keep the transformation within a reasonable range, and the resulting augmented data are unstable. Thus, although traditional augmentation increases the number of limited data samples, it adds little genuine diversity: a true rotation, for example, would rotate every pixel through a given angle in 3D, not merely rotate the image plane itself.
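The rigid transformations mentioned above reduce to applying a fixed 2D rotation-plus-translation matrix to pixel coordinates. A minimal nearest-neighbour sketch (illustrative, not the patent's implementation; real pipelines would use a library warp with interpolation):

```python
import numpy as np

def rigid_augment(image, angle_deg, tx=0, ty=0):
    """Rotate an image about its center by a fixed angle, then translate.

    Minimal nearest-neighbour rigid (rotation + translation) augmentation.
    """
    h, w = image.shape
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            # inverse-map each output pixel to its source location
            xs = c * (x - cx - tx) + s * (y - cy - ty) + cx
            ys = -s * (x - cx - tx) + c * (y - cy - ty) + cy
            xi, yi = int(round(xs)), int(round(ys))
            if 0 <= xi < w and 0 <= yi < h:
                out[y, x] = image[yi, xi]
    return out

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 3] = 255                  # single bright pixel right of center
rot = rigid_augment(img, 90)     # 90-degree rotation about the center
```

The augmented set produced this way contains only as many variants as there are preset matrices, which is exactly the diversity limitation described above.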
To overcome the problems and limitations of traditional augmentation, in 2014 researchers proposed generative adversarial network (GAN) synthesis, which produces augmented samples similar to the real dataset in pixel intensity and structure without relying on large numbers of labeled samples during training; the generator network and discriminator network play a game against each other to produce the optimal output. However, GANs are extremely difficult to train in practice, and problems such as vanishing gradients and overfitting readily occur. To alleviate this, researchers proposed the least-squares GAN, which constructs the loss function by a least-squares criterion with stricter convergence conditions. To make GAN synthesis generate the required samples more purposefully and efficiently, researchers further proposed the conditional GAN, which synthesizes images by supplying a small amount of label information to the training model.
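The least-squares criterion mentioned above replaces the usual log loss with squared errors against target labels (real = 1, fake = 0, generator target = 1). A toy sketch of the two losses (illustrative function names):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Discriminator pushes real scores toward 1 and fake scores toward 0."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Generator pushes the discriminator's fake scores toward 1."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

d_real = np.array([1.0, 1.0])   # perfectly classified real samples
d_fake = np.array([0.0, 0.0])   # perfectly classified fake samples
d_loss = lsgan_d_loss(d_real, d_fake)
g_loss = lsgan_g_loss(d_fake)
```

Unlike the log loss, the squared error still yields a nonzero gradient for confidently rejected fakes, which is why its convergence behaviour is better conditioned.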
In summary, although traditional medical data augmentation can enlarge a limited dataset in quantity, the augmented images lack diverse variation; deep-learning-based augmentation can enlarge limited training samples in both quantity and diversity, but the training process of the synthesis network lacks stability, and the synthesis model needs further optimization.
Disclosure of Invention
Based on this, it is necessary to provide an image synthesis method and system combining an adversarial autoencoder and a generative adversarial network that can augment a limited medical image dataset, enrich medical image databases, stabilize the training of the synthesis network, and improve the precision and generalization of image synthesis, thereby also improving the precision and generalization of medical techniques such as medical image segmentation, registration, and recognition.
In order to achieve the purpose, the invention provides the following scheme:
An image synthesis method combining an adversarial autoencoder and a generative adversarial network, comprising:
constructing an enhanced adversarial autoencoder, which comprises two groups of encoders of different classes, two groups of first discriminators of different classes, and a group of decoders; the two groups of encoders comprise a content encoder and a style encoder, and the two groups of first discriminators comprise a content-code discriminator and a style-code discriminator; the two groups of encoders obtain a content code vector and a style code vector from an input manually segmented vessel-tree image; the two groups of first discriminators distinguish the content code vector from a manually specified prior content code vector and the style code vector from a manually specified prior style code vector, and drive back-propagation adjustment during training; the decoder recombines the content code vector and the style code vector to obtain a reconstructed vessel-tree image;
constructing an improved conditional generative adversarial network, which comprises a generator and a second discriminator; the generator generates a reconstructed fundus retina image from a manually segmented retina outer-contour mask and the reconstructed vessel-tree image output by the enhanced adversarial autoencoder; the second discriminator judges the reconstructed fundus retina image and drives back-propagation adjustment during training;
taking the manually segmented vessel-tree images and the original fundus retina images as training data, and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator;
and synthesizing a fundus retina image from the manually segmented vessel-tree image to be processed based on the optimal vessel-tree image generator and the optimal fundus retina image generator to obtain a synthesized image.
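The two-stage pipeline above (vessel-tree reconstruction by the adversarial autoencoder, then fundus image generation conditioned on the contour mask) can be sketched with stub networks. All function bodies here are illustrative placeholders standing in for trained networks, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 stubs: the enhanced adversarial autoencoder (encoders + decoder).
def content_encoder(v):      # Q_c: vessel tree -> content code (toy)
    return v.mean(axis=1)

def style_encoder(v):        # Q_s: vessel tree -> style code (toy)
    return v.std(axis=1)

def decoder(z_c, z_s):       # R: recombine codes into a reconstructed tree
    return z_c[:, None] + z_s[:, None] * rng.standard_normal((z_c.size, 8))

# Stage 2 stub: conditional generator G(v_hat, m) masked by the retina contour.
def generator(v_hat, mask):
    return np.tanh(v_hat) * mask

v = rng.random((8, 8))       # manually segmented vessel tree (toy)
m = np.ones((8, 8))          # retina outer-contour mask (toy: all inside)

v_hat = decoder(content_encoder(v), style_encoder(v))    # stage 1
synthetic = generator(v_hat, m)                          # stage 2
```

The point of the composition is that varying the style code in stage 1 yields different reconstructed vessel trees, each of which stage 2 turns into a differently styled synthetic fundus image.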
Optionally, the constructing an enhanced adversarial autoencoder specifically comprises:
constructing a content encoder, which comprises a first downsampling layer and a first residual-block network connected in sequence; the first downsampling layer comprises a convolutional layer and first convolution blocks; the first residual-block network comprises a plurality of first standard residual blocks; the convolutional layer comprises a convolution kernel and an activation-function layer; each first convolution block comprises a convolution kernel, an instance normalization layer, and an activation-function layer; each first standard residual block comprises two first convolution blocks joined by a skip connection;
constructing a style encoder, which comprises a second downsampling layer, a global average pooling layer, and a fully connected layer connected in sequence; the second downsampling layer comprises a plurality of the convolutional layers;
constructing a content-code discriminator, which comprises a plurality of first convolution blocks connected in sequence;
constructing a style-code discriminator, which comprises a plurality of second convolution blocks connected in sequence;
constructing a decoder, which comprises a second residual-block network, a multilayer perceptron, and upsampling layers; the second residual-block network comprises a plurality of standard residual blocks, each comprising two second convolution blocks joined by a skip connection; each second convolution block comprises a convolutional layer, an adaptive instance normalization (AdaIN) layer, and an activation-function layer; the multilayer perceptron outputs the AdaIN style parameters corresponding to the style code vector; each upsampling layer comprises a deconvolution layer and an activation-function layer; the number of upsampling layers matches the sum of the numbers of first downsampling layers and second downsampling layers.
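The adaptive instance normalization used in the decoder normalizes each feature channel and then rescales it with style parameters (γ, β); in the architecture above those parameters would come from the multilayer perceptron applied to the style code vector, whereas this sketch supplies them directly:

```python
import numpy as np

def adain(x, gamma, beta, eps=1e-5):
    """Adaptive instance normalization for one feature map of shape (C, H, W).

    Each channel is normalized to zero mean / unit variance over its spatial
    dimensions, then rescaled by per-channel style parameters (gamma, beta).
    """
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    return gamma[:, None, None] * (x - mu) / (sigma + eps) + beta[:, None, None]

x = np.random.default_rng(1).standard_normal((3, 4, 4))
gamma = np.array([2.0, 1.0, 0.5])   # toy style scales
beta = np.array([0.0, 3.0, -1.0])   # toy style shifts
y = adain(x, gamma, beta)
```

Because the spatial statistics are normalized away, the channel means of the output equal β (and the spreads track γ), which is how the style code controls the appearance of the reconstructed vessel tree.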
Optionally, constructing the improved conditional generative adversarial network specifically comprises:
constructing a generator, which comprises convolutional layers, deconvolution layers, and a channel-wise fully connected layer, connected by skip connections; the number of convolutional layers matches the number of deconvolution layers; each convolutional layer comprises a convolution kernel and an activation-function layer; each deconvolution layer comprises a deconvolution kernel and an activation-function layer;
constructing a second discriminator, which comprises convolutional layers, a fully connected layer, and a binary classification layer; each convolutional layer of the discriminator additionally comprises a batch normalization layer and an activation-function layer.
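The skip-connection wiring of the generator above pairs each downsampling stage with the matching upsampling stage, concatenating their feature maps along the channel axis (U-Net style). A toy forward pass with pooling/upsampling standing in for strided (de)convolutions — a sketch, not the patent's network:

```python
import numpy as np

def downsample(x):                      # halve spatial size (2x2 average pool)
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):                        # double spatial size (nearest neighbour)
    return x.repeat(2, axis=1).repeat(2, axis=2)

x0 = np.random.default_rng(2).random((4, 16, 16))   # input feature map
x1 = downsample(x0)                     # encoder feature, kept for the skip
x2 = downsample(x1)                     # bottleneck
u1 = np.concatenate([upsample(x2), x1], axis=0)      # skip connection
u0 = np.concatenate([upsample(u1[:4]), x0], axis=0)  # skip connection
```

The concatenated encoder features let fine vessel detail bypass the bottleneck, which matters when the generator must keep the synthesized fundus image aligned with the input vessel tree.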
Optionally, the taking the manually segmented vessel-tree images and the original fundus retina images as training data and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator specifically comprises:
acquiring manually segmented vessel-tree images and original fundus retina images to obtain training data, each manually segmented vessel-tree image corresponding to an original fundus retina image;
feeding the manually segmented vessel-tree image separately into the content encoder and the style encoder to obtain a content code vector and a style code vector, and feeding both vectors into the decoder to obtain a reconstructed vessel-tree image;
feeding the content code vector together with the prior content code vector into the content-code discriminator for discrimination, and the style code vector together with the prior style code vector into the style-code discriminator for discrimination;
constructing an encoder reconstruction loss function and a first-discriminator adversarial loss function from the manually segmented vessel-tree image and the reconstructed vessel-tree image; the encoder reconstruction loss function is the reconstruction loss corresponding to the content encoder and the style encoder in the enhanced adversarial autoencoder, and the first-discriminator adversarial loss function is the adversarial loss corresponding to the content-code discriminator and the style-code discriminator in the enhanced adversarial autoencoder;
feeding the manually segmented retina outer-contour mask, the reconstructed vessel-tree image, the manually segmented vessel-tree image, and the original fundus retina image into the generator to generate a reconstructed fundus retina image;
feeding the reconstructed vessel-tree image, the reconstructed fundus retina image, the manually segmented vessel-tree image, and the original fundus retina image into the second discriminator for judgment;
constructing a second-discriminator adversarial loss function and a generator global consistency loss function from the original fundus retina image and the reconstructed fundus retina image; the second-discriminator adversarial loss function is the adversarial loss of the second discriminator in the improved conditional generative adversarial network, and the generator global consistency loss function is the global consistency loss of the generator in that network;
obtaining a synthesis-model total loss function from the encoder reconstruction loss function, the first-discriminator adversarial loss function, the second-discriminator adversarial loss function, and the generator global consistency loss function;
jointly training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network by back-propagation according to the synthesis-model total loss function, so that their parameters are continuously updated and optimized to obtain the optimal vessel-tree image generator and the optimal fundus retina image generator; the optimal vessel-tree image generator is the trained enhanced adversarial autoencoder, and the optimal fundus retina image generator is the trained improved conditional generative adversarial network.
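Joint training under a combined total loss can be illustrated on scalars: two parameter blocks (an "autoencoder" scale and a "generator" scale) are updated together by gradient descent on a reconstruction term plus a weighted consistency term. All quantities are toy stand-ins for the networks described above:

```python
import numpy as np

rng = np.random.default_rng(3)
v = rng.random(32)           # toy "vessel tree"
r = 2.0 * v + 0.1            # toy "fundus image" target

a, g = 0.5, 0.5              # parameters: v_hat = a*v, r_hat = g*v_hat
lr, lam = 0.1, 1.0           # learning rate, consistency weight
losses = []
for _ in range(300):
    v_hat = a * v
    r_hat = g * v_hat
    recon = np.mean((v_hat - v) ** 2)      # autoencoder reconstruction loss
    consist = np.mean((r_hat - r) ** 2)    # generator consistency loss
    losses.append(recon + lam * consist)
    # hand-derived gradients of the total loss w.r.t. a and g
    grad_a = np.mean(2 * (v_hat - v) * v) + lam * np.mean(2 * (r_hat - r) * g * v)
    grad_g = lam * np.mean(2 * (r_hat - r) * v_hat)
    a -= lr * grad_a
    g -= lr * grad_g
```

Because both blocks see gradients from the shared total loss, the autoencoder parameter is pulled both toward faithful reconstruction and toward reconstructions the downstream generator can use, which is the motivation for joint rather than stage-wise training.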
Optionally, the obtaining a synthesis-model total loss function from the encoder reconstruction loss function, the first-discriminator adversarial loss function, the second-discriminator adversarial loss function, and the generator global consistency loss function specifically comprises:
obtaining an adversarial autoencoder total loss function from the encoder reconstruction loss function and the first-discriminator adversarial loss function;
obtaining a generative adversarial network total loss function from the second-discriminator adversarial loss function and the generator global consistency loss function;
and linearly adding the adversarial autoencoder total loss function and the generative adversarial network total loss function to obtain the synthesis-model total loss function.
Optionally, the adversarial autoencoder total loss function is
$$L_{AAE}(D_{z_c}, D_{z_s}, Q_c, Q_s, R) = L_{adv}^{z_c} + L_{adv}^{z_s} + \lambda_0 L_{Recon}$$
wherein L_{AAE} is the adversarial autoencoder total loss function; D_{z_c} is the content-code discriminator, D_{z_s} is the style-code discriminator, Q_c is the content encoder, Q_s is the style encoder, and R is the decoder; L_{adv}^{z_c} is the adversarial loss corresponding to the content-code discriminator and L_{adv}^{z_s} is the adversarial loss corresponding to the style-code discriminator; L_{Recon} is the encoder reconstruction loss; the first discriminator comprises D_{z_c} and D_{z_s}; λ_0 is the weight parameter of the adversarial autoencoder total loss function, balancing the adversarial and reconstruction losses;
$$L_{Recon} = \mathbb{E}_{v \sim P_{data}(v)}\big[\, \| v - R(Q_c(v), Q_s(v)) \| \,\big]$$
wherein v denotes a manually segmented vessel-tree image; E_{v∼P_{data}(v)} denotes the expectation over manually segmented vessel-tree images and P_{data}(v) their data distribution; Q_c(v) is the content encoder applied to v, and Q_s(v) is the style encoder applied to the manually segmented vessel-tree image;
$$L_{adv}^{z_c} = \mathbb{E}_{z_c \sim P(z_c)}\big[\log D_{z_c}(z_c)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_c}(Q_c(z_c \mid v))\big)\big]$$
$$L_{adv}^{z_s} = \mathbb{E}_{z_s \sim P(z_s)}\big[\log D_{z_s}(z_s)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_s}(Q_s(z_s \mid v))\big)\big]$$
wherein z_c denotes a content code vector and z_s a style code vector; P(z_c) denotes the manually imposed prior distribution of the content latent variable and P(z_s) the manually imposed prior distribution of the style latent variable; E_{z_c∼P(z_c)} denotes the expectation corresponding to the content code vector and E_{z_s∼P(z_s)} the expectation corresponding to the style code vector; Q_c(z_c | v) denotes the content-code distribution function and Q_s(z_s | v) the style-code distribution function, from which the content code vector and style code vector are obtained; D_{z_c}(z_c) is the content-code discriminator with input z_c, and D_{z_s}(z_s) is the style-code discriminator with input z_s; D_{z_c}(Q_c(z_c | v)) is the content-code discriminator with input Q_c(z_c | v), and D_{z_s}(Q_s(z_s | v)) is the style-code discriminator with input Q_s(z_s | v).
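The adversarial autoencoder total loss combines two adversarial terms with a weighted reconstruction term. A toy numerical sketch, with illustrative discriminator scores and a log-loss form following the standard adversarial autoencoder objective (an assumption, since the patent renders its formulas as images):

```python
import numpy as np

def aae_total_loss(d_prior_c, d_enc_c, d_prior_s, d_enc_s, v, v_recon, lam0=1.0):
    """Adversarial autoencoder total loss:
    L_adv^{z_c} + L_adv^{z_s} + lam0 * L_Recon, with log adversarial terms
    (prior codes scored as real, encoded codes as fake)."""
    adv_c = np.mean(np.log(d_prior_c)) + np.mean(np.log(1.0 - d_enc_c))
    adv_s = np.mean(np.log(d_prior_s)) + np.mean(np.log(1.0 - d_enc_s))
    recon = np.mean(np.abs(v - v_recon))
    return adv_c + adv_s + lam0 * recon

# Toy discriminator scores in (0, 1) and a perfect reconstruction:
half = np.full(4, 0.5)
loss = aae_total_loss(half, half, half, half, np.ones(4), np.ones(4))
```

With maximally uncertain discriminators (all scores 0.5) and zero reconstruction error, the loss reduces to the two adversarial terms alone.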
Optionally, the generative adversarial network total loss function is
$$L_{im2im}(G, D) = L_{adv}(G, D) + \lambda_1 L_{glob}(G)$$
wherein L_{im2im}(G, D) is the generative adversarial network total loss function; G denotes the generator in the improved conditional generative adversarial network and D denotes its second discriminator; L_{adv}(G, D) denotes the second-discriminator adversarial loss function; L_{glob}(G) denotes the generator global consistency loss function; λ_1 is the weight parameter of the generative adversarial network total loss function, balancing the generator global consistency loss and the second-discriminator adversarial loss;
$$L_{glob}(G) = \mathbb{E}_{(v, r)}\big[\, \| r - G(\hat{v}, m) \|_1 \,\big]$$
wherein E_{(v,r)} denotes the expectation over corresponding pairs of manually segmented vessel-tree images and original fundus retina images; v denotes a manually segmented vessel-tree image, r denotes an original fundus retina image, v̂ denotes a reconstructed vessel-tree image, and m is the retina outer-contour mask; G(v̂, m) denotes the generator with inputs v̂ and m;
$$L_{adv}(G, D) = \mathbb{E}_{(v, r)}\big[\log D(m, (v, r))\big] + \mathbb{E}_{\hat{v}}\big[\log\big(1 - D(m, (\hat{v}, G(\hat{v}, m)))\big)\big]$$
wherein E_{v̂} denotes the expectation corresponding to the reconstructed vessel-tree image; D(m, (v, r)) denotes the second discriminator with inputs m, v, and r; D(m, (v̂, G(v̂, m))) denotes the second discriminator with inputs m, v̂, and G(v̂, m), the generator output for inputs v̂ and m.
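The conditional GAN total loss adds the generator's global consistency term, weighted by λ_1, to the adversarial term. A toy numerical sketch, with illustrative scores and an L1 consistency/log adversarial form (an assumption, since the patent renders its formulas as images):

```python
import numpy as np

def gan_total_loss(d_real, d_fake, r, r_fake, lam1=10.0):
    """Conditional GAN total loss: adversarial term plus lam1 times the
    generator's global (L1) consistency loss."""
    adv = np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))
    glob = np.mean(np.abs(r - r_fake))
    return adv + lam1 * glob

d_real = np.full(4, 0.9)        # discriminator scores on real (m, v, r)
d_fake = np.full(4, 0.1)        # scores on generated (m, v_hat, G(v_hat, m))
r = np.ones((2, 2))             # original fundus image (toy)
r_fake = np.full((2, 2), 0.8)   # reconstructed fundus image, off by 0.2
loss = gan_total_loss(d_real, d_fake, r, r_fake, lam1=10.0)
```

The λ_1 weight trades off pixel-level fidelity to the paired real image against realism as judged by the second discriminator.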
Optionally, the synthesis-model total loss function is
$$L_{total} = L_{AAE}(D_{z_c}, D_{z_s}, Q_c, Q_s, R) + L_{im2im}(G, D)$$
wherein L_{total} denotes the synthesis-model total loss function.
The present invention also provides an image synthesis system combining an adversarial autoencoder and a generative adversarial network, comprising:
an enhanced adversarial autoencoder construction module for constructing an enhanced adversarial autoencoder; the enhanced adversarial autoencoder comprises two groups of encoders of different classes, two groups of first discriminators of different classes, and a group of decoders; the two groups of encoders comprise a content encoder and a style encoder, and the two groups of first discriminators comprise a content-code discriminator and a style-code discriminator; the two groups of encoders obtain a content code vector and a style code vector from an input manually segmented vessel-tree image; the two groups of first discriminators distinguish the content code vector from a manually specified prior content code vector and the style code vector from a manually specified prior style code vector, and drive back-propagation adjustment during training; the decoder recombines the content code vector and the style code vector to obtain a reconstructed vessel-tree image;
an improved conditional generative adversarial network construction module for constructing an improved conditional generative adversarial network, which comprises a generator and a second discriminator; the generator generates a reconstructed fundus retina image from the manually segmented retina outer-contour mask and the reconstructed vessel-tree image output by the enhanced adversarial autoencoder; the second discriminator judges the reconstructed fundus retina image and drives back-propagation adjustment during training;
a model training module for taking the manually segmented vessel-tree images and the original fundus retina images as training data and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator;
and a synthesized-image determination module for synthesizing a fundus retina image from the manually segmented vessel-tree image to be processed based on the optimal vessel-tree image generator and the optimal fundus retina image generator to obtain a synthesized image.
Optionally, the model training module specifically includes:
the training data acquisition unit is used for acquiring an artificial segmentation blood vessel tree image and an original fundus retina image to obtain training data; the artificial segmented blood vessel tree image corresponds to the original fundus retina image;
the vessel tree image reconstruction unit is used for respectively taking the artificially segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image;
a coding judgment unit configured to perform discrimination by using the content coding vector and the prior content coding vector as inputs of the content coding discriminator, and perform discrimination by using the style coding vector and the prior style coding vector as inputs of the style coding discriminator;
an enhanced countermeasure automatic encoder loss function establishing unit, configured to construct an encoder reconstruction loss function and a first discriminator countermeasure loss function according to the artificially segmented vessel tree image and the reconstructed vessel tree image; the encoder reconstruction loss function is a reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator confrontation loss function is a confrontation loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced confrontation automatic encoder;
a fundus retina image reconstruction unit, configured to generate a reconstructed fundus retina image by using the artificial segmented retina outer contour mask, the reconstructed blood vessel tree image, the artificial segmented blood vessel tree image, and the original fundus retina image as the input of the generator;
a fundus retina image discrimination unit, configured to use the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificially segmented blood vessel tree image, and the original fundus retina image as input of a second discriminator to perform discrimination;
a generation countermeasure network loss function establishing unit, configured to construct a second discriminator countermeasure loss function and a generator global consistency loss function according to the original fundus retina image and the reconstructed fundus retina image; the second discriminator countermeasure loss function is the countermeasure loss function of the second discriminator in the improved conditional generation countermeasure network; the generator global consistency loss function is the global consistency loss function of the generator in the improved conditional generation countermeasure network;
a synthetic model total loss function establishing unit, configured to obtain a synthetic model total loss function according to the encoder reconstruction loss function, the first discriminator countermeasure loss function, the second discriminator countermeasure loss function, and the generator global consistency loss function;
an optimal model determining unit, configured to jointly train the combined enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network in a back propagation manner according to the synthetic model total loss function, so that the parameters of both are continuously updated and optimized to obtain an optimal vessel tree image generator and an optimal fundus retina image generator; the optimal vessel tree image generator is the trained combined enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an image synthesis method and system combining a countermeasure autoencoder and a generation countermeasure network. The method first constructs an enhanced countermeasure automatic encoder comprising two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders, and constructs an improved conditional generation countermeasure network comprising a generator and a second discriminator. Then, taking the artificially segmented blood vessel tree image and the original fundus retina image as training data, the constructed model is iteratively trained to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator. Finally, synthetic sample data with higher precision and more diversified styles is obtained based on the optimal model. By adopting the method or the system, sample data amplification can be performed on a limited medical image data set, enriching the medical image database; the stability of the synthesis network training process can be improved; and the precision and generalization of image synthesis are improved, which in turn improves the precision and generalization of medical technologies such as medical image segmentation, medical image registration and medical image identification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an image synthesis method incorporating a countermeasure autoencoder and generating a countermeasure network in accordance with an embodiment of the present invention;
FIG. 2 is a block diagram of an enhanced countermeasure autoencoder in accordance with an embodiment of the present invention;
FIG. 3 is a block diagram of an encoder in the enhanced countermeasure auto-encoder according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a first discriminator of an enhanced countermeasure auto-encoder according to an embodiment of the present invention;
FIG. 5 is a block diagram of a decoder in an enhanced countermeasure auto-encoder according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a generator in a conditionally generated countermeasure network after modification according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a second discriminator in the conditionally generated countermeasure network according to an embodiment of the present invention;
FIG. 8 is a block diagram of an image composition model incorporating an enhanced countermeasure autoencoder and an improved conditional generation countermeasure network in accordance with an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an image synthesis system combining a countermeasure self-encoder and a generation countermeasure network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
FIG. 1 is a flow chart of an image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to an embodiment of the present invention. Referring to FIG. 1, the image synthesis method combining a countermeasure autoencoder and a generation countermeasure network of the present embodiment includes:
step 101: and acquiring an artificial segmentation blood vessel tree image and an original fundus retina image to obtain training data.
The artificial segmented blood vessel tree image corresponds to the original fundus retina image.
Step 102: constructing an enhanced countermeasure autoencoder; the enhanced countermeasure autoencoder includes two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders.
The two groups of encoders of different categories comprise a content encoder and a style encoder, and the two groups of first discriminators of different categories comprise a content coding discriminator and a style coding discriminator. The two groups of encoders of different categories are used for obtaining, from the input artificially segmented blood vessel tree image, a content coding vector and a style coding vector of the artificially segmented blood vessel tree image. The two groups of first discriminators of different categories are used for distinguishing and discriminating the content coding vectors from the manually acquired prior content coding vectors, distinguishing and discriminating the style coding vectors from the manually acquired prior style coding vectors, and performing reverse adjustment training. The decoder is used for recombining the content coding vector and the style coding vector to obtain a reconstructed vessel tree image. The specific structure of the enhanced countermeasure automatic encoder is shown in fig. 2.
Step 103: constructing an improved conditional generation countermeasure network; the improved conditional generation countermeasure network includes a generator and a second discriminator.
Wherein the generator is used for generating a reconstructed fundus retina image according to the artificially segmented retina outer contour mask and the reconstructed blood vessel tree image output by the enhanced countermeasure automatic encoder; and the second discriminator is used for judging the reconstructed fundus retina image and performing reverse adjustment training.
Step 104: and taking the artificially segmented blood vessel tree image and the original fundus retina image as training data, and performing iterative training on the combined enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator.
Step 105: and performing fundus retina image synthesis on the to-be-processed artificial segmentation blood vessel tree image based on the optimal blood vessel tree image generator and the optimal fundus retina image generator to obtain a synthesized image.
Step 102 is the process of reconstructing the blood vessel tree image in this embodiment, and is performed under the deep learning TensorFlow framework. The main purpose of vessel tree image reconstruction is to obtain reconstructed vessel tree images with more diversified characteristics: content coding vectors and style coding vectors are extracted from the artificially segmented vessel tree images by encoders of different types, and the two types of coding vectors are recombined. The goal of vessel tree image reconstruction is to amplify the blood vessel tree image data set in preparation for synthesizing the fundus retina image at a later stage. The reconstruction in step 102 proceeds as follows:
11) A content encoder and a style encoder are constructed.
The content encoder comprises a first down-sampling layer and a first residual block network which are connected in sequence; the first down-sampling layer comprises a convolution layer and a first convolution block; the first network of residual blocks comprises a plurality of first standard residual blocks; the convolution layer comprises a convolution kernel and an activation function layer; the first convolution block includes a convolution kernel, an instance normalization layer, and an activation function layer; the first standard residual block includes two of the first convolution blocks connected in a jump. The style encoder comprises a second down-sampling layer, a global average pooling layer and a full-connection layer which are connected in sequence; the second downsampling layer includes a plurality of the convolutional layers.
In order to fully extract the characteristic information in the artificially segmented blood vessel tree image, the content encoder and the style encoder in this embodiment adopt convolutional neural networks of different structures. The content encoder adopts a standard residual block structure and is used for acquiring the content coding vector of the artificially segmented blood vessel tree image; the content coding vector has domain invariance and is the carrier connecting the artificially segmented blood vessel tree image and the reconstructed blood vessel tree image, determining the structural similarity between them. The style encoder adopts a combined structure of convolution blocks, global average pooling and full connection, and is used for obtaining the style coding vector of the artificially segmented blood vessel tree image; the style coding vector is domain-specific and is the guide factor that enriches the diversity and characteristics of the reconstructed blood vessel tree image.
The structure of the content encoder and the style encoder is shown in fig. 3:
specifically, the content encoder employed includes: the artificially segmented vessel tree image, a convolution layer, two convolution blocks and four standard residual blocks (each residual block comprising two convolution blocks), connected in sequence. The first convolution layer is connected to the input image, using 64 convolution kernels of size 7 × 7 with a stride of 1; the convolution layer is composed of convolution kernels and an activation function. The first and second convolution blocks both have a stride of 2, adopt convolution kernels of size 4 × 4, and have 128 and 256 channels respectively. The four standard residual blocks are identical in structure; only the first is shown in fig. 3. It contains two convolution blocks, each consisting of a convolution kernel, an instance normalization and an activation function, using 256 convolution kernels of size 3 × 3 with a stride of 2.
Specifically, the style encoder adopted includes: the artificially segmented blood vessel tree image, five convolution layers, a global average pooling layer and a fully connected layer, connected in sequence. The first convolution layer is the same as the first convolution layer in the content encoder; the second to fourth convolution layers adopt convolution kernels of size 4 × 4 with a stride of 2, and have 128, 256 and 256 channels in sequence. The kernel size of the global average pooling layer is 2 × 2 with a stride of 2, and the number of channels of the fully connected layer is 8.
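The downsampling arithmetic in the two encoders can be checked with the standard convolution output-size formula. The sketch below assumes a 256 × 256 input and "size-preserving"/"halving" paddings (padding 3 for the 7 × 7 stride-1 layer, padding 1 for the 4 × 4 stride-2 blocks); the patent states kernel sizes, strides and channel counts but not paddings or input resolution.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

# Content encoder downsampling path from the description above
# (paddings and the 256x256 input resolution are illustrative assumptions):
h = 256                       # hypothetical input resolution
h = conv_out(h, 7, 1, 3)      # first conv layer: 64 channels, size preserved
h = conv_out(h, 4, 2, 1)      # first conv block: 128 channels, size halved
h = conv_out(h, 4, 2, 1)      # second conv block: 256 channels, size halved
print(h)  # 64: the residual blocks then operate at 1/4 resolution
```

The same formula applies to the style encoder's stride-2 layers, each of which halves the spatial size before global average pooling collapses it entirely.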
12) A content coding discriminator and a style coding discriminator are constructed. The content coding discriminator includes a plurality of the first convolution blocks connected in sequence. The style coding discriminator includes a plurality of second convolution blocks connected in sequence.
In order to impose a discrimination constraint on the precision of the content coding vectors and style coding vectors acquired by the encoders of different types, so that the precision of the extracted coding vectors of each type is optimal, this embodiment constructs a content coding discriminator and a style coding discriminator. The two discriminators of different types have the same structure, built from convolution blocks. The extracted content coding vector and style coding vector, together with the corresponding manually acquired prior content coding vector and prior style coding vector, are taken as the inputs of the content coding discriminator and the style coding discriminator respectively; the discriminators judge the degree of difference between them and output an evaluation feedback signal used to optimize the parameters of the encoders of different types.
The structure of the content coding discriminator and the style coding discriminator is shown in fig. 4:

specifically, the content coding discriminator and the style coding discriminator adopted have the same structure, including: the content coding vector (or style coding vector), the manually acquired prior content coding vector (or manually acquired prior style coding vector), and four convolution blocks, connected in sequence. The four convolution blocks are similar in structure, adopting convolution kernels of size 4 × 4 with a stride of 2, and have 64, 128, 256 and 512 channels respectively; each convolution block contains a convolution kernel, an instance normalization and an activation function.
13) A decoder is constructed. The decoder comprises a second residual block network, a multilayer perceptron and up-sampling layers; the second residual block network comprises a plurality of first standard residual blocks; each first standard residual block comprises two second convolution blocks joined by a jump connection; each second convolution block includes a convolution layer, an adaptive instance normalization layer and an activation function layer. The multilayer perceptron is used for outputting the adaptive instance normalization style parameters corresponding to the style coding vector. Each up-sampling layer comprises a deconvolution layer and an activation function layer; the number of up-sampling layers matches the sum of the number of first down-sampling layers and the number of second down-sampling layers.
In order to recombine the different types of coding vectors and obtain reconstructed vessel tree images with diverse characteristics, this embodiment combines the same standard residual block structure as in the content encoder with a multilayer perceptron to construct the decoder. The multilayer perceptron extracts the adaptive instance normalization style parameters from the style coding vector, which serve as the guide factor enriching the diversity of the reconstructed blood vessel tree image. The combination of four standard residual blocks and three convolution layers generates the reconstructed vessel tree image.
The structure of the decoder is shown in fig. 5:
specifically, the decoder structure adopted includes: the content coding vector connected to four standard residual blocks, two up-sampling layers and a convolution layer; the style coding vector connected to the multilayer perceptron, whose output is combined with the output of the four residual blocks. The four standard residual blocks have the same structure as those in the content encoder; each standard residual block comprises two convolution blocks, and each convolution block consists of a convolution kernel, an adaptive instance normalization and an activation function. The two up-sampling layers have the same structure, adopting convolution kernels of size 5 × 5 with a stride of 2, with 128 and 64 channels respectively; each up-sampling layer consists of deconvolution kernels and an activation function. The last convolution layer adopts convolution kernels of size 7 × 7 with a stride of 2, and the number of output channels is 3. The style coding vector is taken as the input of the multilayer perceptron to obtain the adaptive instance normalization parameters used to enrich the reconstructed blood vessel tree image.
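The adaptive instance normalization step, in which the MLP-predicted style parameters restyle the content features, can be sketched in NumPy as follows. Shapes and style-parameter values here are illustrative, not taken from the patent.

```python
import numpy as np

def adaptive_instance_norm(content, gamma, beta, eps=1e-5):
    """AdaIN: normalize each channel of the content feature map to zero
    mean and unit variance, then rescale with the style parameters
    (gamma, beta) predicted by the multilayer perceptron."""
    # content: (C, H, W); gamma, beta: (C,)
    mu = content.mean(axis=(1, 2), keepdims=True)
    sigma = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mu) / (sigma + eps)
    return gamma[:, None, None] * normalized + beta[:, None, None]

rng = np.random.default_rng(0)
feat = rng.normal(size=(256, 64, 64))   # content features from the residual path
gamma = rng.normal(size=256)            # style scale from the MLP (hypothetical)
beta = rng.normal(size=256)             # style shift from the MLP (hypothetical)
out = adaptive_instance_norm(feat, gamma, beta)
# Each output channel now carries the style statistics (mean == beta):
print(np.allclose(out.mean(axis=(1, 2)), beta, atol=1e-6))  # True
```

This is what lets one content coding vector be decoded into differently styled reconstructions: only `gamma` and `beta` change between styles.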
The main reasons why the standard residual block is used by both the content encoder and the decoder in the enhanced countermeasure autoencoder of the present embodiment are as follows:
(1) It alleviates the gradient vanishing problem that arises as the number of layers of the training model grows, so that the network weight parameters of the shallow layers can be trained better.
(2) The jump connection in the standard residual block can effectively fuse the characteristic information between the low layer and the high layer, so that the gradient information can smoothly pass through each residual block.
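The jump connection described in the two points above can be sketched as follows, with a 1 × 1 channel mix standing in for the real convolution blocks (a toy NumPy illustration; the weight shapes are illustrative):

```python
import numpy as np

def conv_block(x, weight):
    """Stand-in for one convolution block: a 1x1 channel mix followed by
    ReLU, purely to illustrate the residual wiring, not the real layers."""
    return np.maximum(0.0, np.einsum('oc,chw->ohw', weight, x))

def residual_block(x, w1, w2):
    """Standard residual block: two convolution blocks plus a jump
    connection that adds the input back onto the output."""
    return x + conv_block(conv_block(x, w1), w2)

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 4, 4))
w1, w2 = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
y = residual_block(x, w1, w2)
# With zero weights the block degenerates to an identity map; this is why
# gradient information passes smoothly through each residual block:
assert np.allclose(residual_block(x, np.zeros((8, 8)), np.zeros((8, 8))), x)
```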
Step 103 here is the fundus retina image synthesis process of this embodiment, also performed under the deep learning TensorFlow framework. The main purpose of fundus retina image synthesis is to acquire more diversified reconstructed fundus retina images that are structurally similar to the original fundus retina images; the goal is to expand a limited fundus retina training data set. The synthesis in step 103 proceeds as follows:
21) constructing a generator; the generator comprises a convolution layer, a deconvolution layer and a channel-by-channel full connection layer, and the connection mode of the generator is jump connection; the number of the convolution layers is matched with that of the deconvolution layers; the convolution layer comprises a convolution kernel and an activation function layer; the deconvolution layer includes a deconvolution kernel and an activation function layer.
In order to generate reconstructed fundus retina images with diverse characteristics and high fidelity that are structurally similar to the original fundus retina images, the generator network of this embodiment adopts a convolutional neural network structure similar to U-Net. As the depth of the model increases, part of the effective feature information in the reconstructed image is lost; jump connections are added to fuse the feature information of low and high layers and compensate for this loss. The fundus retina outer contour mask is added to the improved conditional generation countermeasure network so that fundus retina images can be acquired more efficiently.
The structure of the generator is shown in fig. 6:
specifically, the generator network structure adopted includes: sequentially connecting the fundus retina outline mask with the reconstructed blood vessel tree image, four convolutional layers, a channel-by-channel full-connection layer and four deconvolution layers. The first convolution layer adopts convolution kernels with the size of 7 × 7 and the step pitch of 2, the second convolution layer and the third convolution layer both adopt convolution kernels with the size of 5 × 5 and the step pitch of 2, the fourth convolution layer adopts convolution kernels with the size of 3 × 3 and the step pitch of 2, and the number of channels corresponding to the four convolution layers is 64, 128, 256 and 512 respectively; the number of channels of the channel-by-channel full connection layer is 512; the four deconvolution layers and the four convolution layers are symmetrical in structure, parameters of the first deconvolution layer are the same as those of the fourth convolution layer, parameters of the second deconvolution layer and the third deconvolution layer are the same as those of the third convolution layer and the second convolution layer, and parameter settings of the fourth deconvolution layer and the first convolution layer are the same.
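The symmetry between the convolution and deconvolution layers described above can be checked with the transposed-convolution output-size formula. The paddings here are assumptions (the patent states only kernel sizes, strides and channel counts); they are chosen so that each deconvolution exactly doubles the spatial size its mirror-image convolution halved.

```python
def deconv_out(size, kernel, stride, padding, output_padding=0):
    """Spatial output size of a transposed convolution (deconvolution)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

# A 4x4 stride-2 deconvolution with padding 1 doubles the spatial size:
print(deconv_out(64, 4, 2, 1))      # 128
# For the generator's odd 5x5 kernels, padding 2 plus output_padding 1
# restores exact doubling (these paddings are illustrative assumptions):
print(deconv_out(64, 5, 2, 2, 1))   # 128
```

This is the arithmetic behind the statement that "the four deconvolution layers and the four convolution layers are symmetrical in structure": each upsampling stage undoes the spatial reduction of the corresponding downsampling stage.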
22) A second discriminator is constructed. The second discriminator comprises convolution layers, a fully connected layer and a binary classification layer; each convolution layer of the discriminator includes a batch normalization layer and an activation function layer. The binary classification layer is used for distinguishing real images from generated images according to the features output by the fully connected layer.

In order to ensure that the reconstructed fundus retina image is visually more realistic and structurally similar to the original fundus retina image, a binary classification discriminator network is constructed to judge whether an input image is an original fundus retina image or a reconstructed fundus retina image.
The structure of the discriminator is shown in FIG. 7:
specifically, the second discriminator includes: the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificially segmented blood vessel tree image and the original fundus retina image, followed by four convolution layers, a fully connected layer and a binary classification layer, connected in sequence. The four convolution layers have the same structure, adopting convolution kernels of size 5 × 5 with a stride of 2, with 64, 128, 256 and 512 channels respectively; the number of channels of the fully connected layer is 512, and the number of channels of the binary classification layer is 1024.
The reasons why the generator network in the improved conditional generation countermeasure network of this embodiment adopts a symmetrical structure similar to U-Net are as follows:

(1) With the added jump connections, the symmetrical structure effectively fuses the feature information between low and high layers, compensating for the loss of effective information caused by deeper network layers.

(2) The U-Net network has a unique structure and strong adaptability, and a model with better performance can be acquired by training on a limited data set.
Step 104 is the iterative training process performed on the synthesis model combining the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network, based on the training data set of artificially segmented blood vessel tree images and original fundus retina images. The iterative training in step 104 proceeds as follows:
31) acquiring an artificially segmented blood vessel tree image and an original fundus retina image to obtain training data; the artificial segmented blood vessel tree image corresponds to the original fundus retina image.
32) And respectively taking the manually segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image.
33) And taking the content coding vector and the prior content coding vector as the input of the content coding discriminator to distinguish, and taking the style coding vector and the prior style coding vector as the input of the style coding discriminator to distinguish.
34) An encoder reconstruction loss function and a first discriminator countermeasure loss function are constructed according to the artificially segmented blood vessel tree image and the reconstructed blood vessel tree image; the encoder reconstruction loss function is the reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator countermeasure loss function is the countermeasure loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced countermeasure automatic encoder.
35) And generating a reconstructed fundus retina image by taking the artificial segmented retina outer contour mask, the reconstructed blood vessel tree image, the artificial segmented blood vessel tree image and the original fundus retina image as the input of the generator.
36) And taking the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificial segmentation blood vessel tree image and the original fundus retina image as the input of a second discriminator for judgment.
37) A second discriminator countermeasure loss function and a generator global consistency loss function are constructed according to the original fundus retina image and the reconstructed fundus retina image; the second discriminator countermeasure loss function is the countermeasure loss function of the second discriminator in the improved conditional generation countermeasure network; the generator global consistency loss function is the global consistency loss function of the generator in the improved conditional generation countermeasure network.
38) And obtaining a total loss function of a synthetic model according to the reconstruction loss function of the encoder, the countermeasure loss function of the first discriminator, the countermeasure loss function of the second discriminator and the global consistency loss function of the generator. The method specifically comprises the following steps:
a countermeasure automatic encoder total loss function is obtained from the encoder reconstruction loss function and the first discriminator countermeasure loss function; a generation countermeasure network total loss function is obtained from the second discriminator countermeasure loss function and the generator global consistency loss function; and the countermeasure automatic encoder total loss function and the generation countermeasure network total loss function are linearly added to obtain the synthetic model total loss function.
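The linear addition described above can be sketched numerically as follows. The function name and the weight values are illustrative; the patent only names the weight parameters λ0 and λ1.

```python
def synthesis_total_loss(l_adv_c, l_adv_s, l_recon, l_adv_gan, l_glob,
                         lambda0=1.0, lambda1=10.0):
    """Synthetic model total loss: countermeasure autoencoder total loss
    plus generation countermeasure network total loss. lambda0 and lambda1
    are the balancing weights; their values here are hypothetical."""
    l_aae = l_adv_c + l_adv_s + lambda0 * l_recon   # countermeasure autoencoder total
    l_gan = l_adv_gan + lambda1 * l_glob            # generation countermeasure net total
    return l_aae + l_gan

print(synthesis_total_loss(1.0, 1.0, 2.0, 1.0, 0.5))  # 10.0
```

During joint training, a single backward pass through this combined scalar updates the encoders, decoder, generator and discriminators together.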
39) The combined enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network are jointly trained in a back propagation manner according to the synthetic model total loss function, so that their parameters are continuously updated and optimized to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator; the optimal blood vessel tree image generator is the trained combined enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network. The optimal blood vessel tree image generator and the optimal fundus retina image generator constitute the final overall synthesis model, shown in fig. 8.
In this step, during the process of continuous updating and optimization, the whole synthesis model generates a series of reconstructed blood vessel tree images and reconstructed fundus retina images. The discriminator network learns from the reconstructed and original fundus retina images to determine how structurally close the reconstructed fundus retina image is to the original, and outputs a feedback signal evaluating the synthesis quality. The blood vessel tree image generator network and the fundus retina image generator network optimize their network parameters according to this feedback signal, so that reconstructed fundus retina images with more diversified characteristics can be obtained, finally yielding the optimal blood vessel tree image generator network and the optimal fundus retina image generator network.
Wherein the countermeasure autoencoder total loss function in step 38) is

$$\min_{Q_c,\,Q_s,\,R}\;\max_{D_{z_c},\,D_{z_s}}\; L_{AAE} = L_{adv}^{z_c}\big(D_{z_c}, Q_c\big) + L_{adv}^{z_s}\big(D_{z_s}, Q_s\big) + \lambda_0\, L_{Recon}\big(Q_c, Q_s, R\big)$$

wherein $L_{AAE}$ is the countermeasure autoencoder total loss function; $D_{z_c}$ is the content coding discriminator, $D_{z_s}$ is the style coding discriminator, $Q_c$ is the content encoder, $Q_s$ is the style encoder, and $R$ is the decoder; $L_{adv}^{z_c}$ is the countermeasure loss corresponding to the content coding discriminator, and $L_{adv}^{z_s}$ is the countermeasure loss corresponding to the style coding discriminator; $L_{Recon}$ is the encoder reconstruction loss; the first discriminator comprises $D_{z_c}$ and $D_{z_s}$; $\lambda_0$ is the weight parameter of the countermeasure autoencoder total loss function, used to balance the countermeasure losses and the reconstruction loss.
$$L_{Recon}\big(Q_c, Q_s, R\big) = \mathbb{E}_{v \sim P_{data}(v)}\Big[\big\lVert R\big(Q_c(v), Q_s(v)\big) - v \big\rVert_1\Big]$$

wherein $v$ denotes the artificially segmented vessel tree image; $\mathbb{E}_{v \sim P_{data}(v)}$ denotes the expected value corresponding to the artificially segmented vessel tree image, and $P_{data}(v)$ denotes the data distribution of the artificially segmented vessel tree images; $Q_c(v)$ is the content encoder with input $v$, and $Q_s(v)$ is the style encoder with input the artificially segmented blood vessel tree image.
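A minimal numeric sketch of the encoder reconstruction loss, assuming an L1 pixel distance (the patent names the loss but the image placeholder omits the norm; the function name and the toy images are illustrative):

```python
import numpy as np

def reconstruction_loss(v, v_recon):
    """Encoder reconstruction loss: mean pixel-wise L1 distance between
    the artificially segmented vessel tree image v and its reconstruction
    R(Qc(v), Qs(v)). The L1 choice is an assumption."""
    return np.abs(v - v_recon).mean()

v = np.array([[0.0, 1.0], [1.0, 0.0]])         # toy binary vessel mask
v_recon = np.array([[0.25, 0.75], [0.75, 0.25]])  # hypothetical decoder output
print(reconstruction_loss(v, v_recon))  # 0.25
```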
$$\mathcal{L}_{adv}^{D_{z_c}}=\mathbb{E}_{z_c\sim P(z_c)}\left[\log D_{z_c}(z_c)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_c}\left(Q_c(z_c\mid v)\right)\right)\right]$$

$$\mathcal{L}_{adv}^{D_{z_s}}=\mathbb{E}_{z_s\sim P(z_s)}\left[\log D_{z_s}(z_s)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_s}\left(Q_s(z_s\mid v)\right)\right)\right]$$

Wherein $z_c$ represents the content encoding vector, and $z_s$ represents the style encoding vector; $P(z_c)$ represents the prior distribution of the artificially added content latent variable, and $P(z_s)$ represents the prior distribution of the artificially added style latent variable; $\mathbb{E}_{z_c\sim P(z_c)}$ represents the expected value corresponding to the content encoding vector, and $\mathbb{E}_{z_s\sim P(z_s)}$ represents the expected value corresponding to the style encoding vector; $Q_c(z_c\mid v)$ represents the content encoding distribution function, $Q_s(z_s\mid v)$ represents the style encoding distribution function, $Q_c(z_c\mid v)$ is used for obtaining the content encoding vector, and $Q_s(z_s\mid v)$ is used for obtaining the style encoding vector; $D_{z_c}(z_c)$ is the content encoding discriminator with input $z_c$, and $D_{z_s}(z_s)$ is the style encoding discriminator with input $z_s$; $D_{z_c}(Q_c(z_c\mid v))$ is the content encoding discriminator with input $Q_c(z_c\mid v)$, and $D_{z_s}(Q_s(z_s\mid v))$ is the style encoding discriminator with input $Q_s(z_s\mid v)$.
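The countermeasure autoencoder total loss above combines the two code-discriminator countermeasure losses with the weighted reconstruction loss. The following is an illustrative numerical sketch only, not the patent's implementation: toy arrays stand in for the vessel tree image, its reconstruction $R(Q_c(v),Q_s(v))$, and the discriminator outputs, and the shapes and the value of the weight parameter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def recon_loss(v, v_hat):
    # Encoder reconstruction loss: mean L1 distance between the artificially
    # segmented vessel tree image v and its reconstruction R(Q_c(v), Q_s(v)).
    return np.abs(v - v_hat).mean()

def code_adv_loss(d_prior, d_encoded):
    # Countermeasure loss of a code discriminator: prior samples z ~ P(z)
    # should be scored as real, encoder outputs Q(z|v) as fake.
    eps = 1e-8
    return np.log(d_prior + eps).mean() + np.log(1.0 - d_encoded + eps).mean()

# Toy tensors standing in for images and discriminator outputs (assumed shapes).
v     = rng.random((4, 64, 64))                       # vessel tree "images"
v_hat = v + 0.05 * rng.standard_normal(v.shape)       # imperfect reconstruction
d_c_prior, d_c_enc = rng.uniform(0.6, 0.9, 8), rng.uniform(0.1, 0.4, 8)
d_s_prior, d_s_enc = rng.uniform(0.6, 0.9, 8), rng.uniform(0.1, 0.4, 8)

lambda_0 = 10.0  # assumed weight balancing countermeasure and reconstruction losses
loss_aae = (code_adv_loss(d_c_prior, d_c_enc)
            + code_adv_loss(d_s_prior, d_s_enc)
            + lambda_0 * recon_loss(v, v_hat))
```

A perfect reconstruction drives the reconstruction term to zero, while a confident discriminator (prior scored near 1, encoded near 0) drives its countermeasure term toward zero.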
The generated countermeasure network total loss function in step 38) is

$$L_{im2im}(G,D)=L_{adv}(G,D)+\lambda_1\mathcal{L}_{global}(G)$$

$$\mathcal{L}_{global}(G)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\left\|r-G(v,m)\right\|_1\right]$$

Wherein $L_{im2im}(G,D)$ is the generated countermeasure network total loss function; $G$ represents the generator in the improved conditional generation countermeasure network, and $D$ represents the second discriminator in the improved conditional generation countermeasure network; $L_{adv}(G,D)$ represents the second discriminator countermeasure loss function; $\mathcal{L}_{global}(G)$ represents the generator global consistency loss function; $\mathbb{E}_{(v,r)\sim P_{data}(v,r)}$ represents the expected value corresponding to the artificially segmented blood vessel tree image and the original fundus retina image; $v$ represents the artificially segmented blood vessel tree image, $r$ represents the original fundus retina image, $\tilde{v}$ represents the reconstructed blood vessel tree image, and $m$ is the retina outer contour mask; $\lambda_1$ is the weight parameter corresponding to the generated countermeasure network total loss function, and is used for balancing the generator global consistency loss and the second discriminator countermeasure loss; $G(v,m)$ represents the generator with inputs $v$ and $m$.
$$L_{adv}(G,D)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\log D(m,(v,r))\right]+\mathbb{E}_{\tilde{v}}\left[\log\left(1-D\left(m,\left(\tilde{v},G(m,\tilde{v})\right)\right)\right)\right]$$

Wherein $\mathbb{E}_{\tilde{v}}$ represents the expected value corresponding to the reconstructed blood vessel tree image; $D(m,(v,r))$ represents the second discriminator with inputs $m$, $v$, and $r$; $D(m,(\tilde{v},G(m,\tilde{v})))$ is the second discriminator with inputs $m$, $\tilde{v}$, and $G(m,\tilde{v})$; $G(m,\tilde{v})$ represents the generator with inputs $m$ and $\tilde{v}$.
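The two terms of the improved conditional generation countermeasure network total loss can likewise be sketched numerically. This is an illustrative sketch only, not the patent's implementation: the discriminator outputs and images are toy arrays, and the shapes and the value of the weight parameter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def adv_loss(d_real, d_fake):
    # Second-discriminator countermeasure loss: D(m,(v,r)) should approach 1
    # on real pairs, D(m,(v~,G(m,v~))) should approach 0 on synthesized pairs.
    eps = 1e-8
    return np.log(d_real + eps).mean() + np.log(1.0 - d_fake + eps).mean()

def global_consistency_loss(r, g_out):
    # Mean L1 distance between the original fundus retina image r and the
    # generator output, encouraging globally consistent structure.
    return np.abs(r - g_out).mean()

r     = rng.random((4, 64, 64, 3))                 # original fundus images
g_out = r + 0.1 * rng.standard_normal(r.shape)     # generator output
d_real = rng.uniform(0.7, 0.95, 8)
d_fake = rng.uniform(0.05, 0.3, 8)

lambda_1 = 100.0  # assumed weight balancing consistency against countermeasure loss
loss_im2im = adv_loss(d_real, d_fake) + lambda_1 * global_consistency_loss(r, g_out)
```

The L1 consistency term vanishes only when the generator output exactly matches the original fundus retina image.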
The total loss function of the synthetic model in step 38) is

$$\mathcal{L}_{total}=\mathcal{L}_{AAE}+L_{im2im}(G,D)$$

Wherein $\mathcal{L}_{total}$ represents the synthetic model total loss function.
The image synthesis method combining the countermeasure autoencoder and the generation countermeasure network in this embodiment can augment a limited medical image data set with additional samples, enrich a medical image database, improve the stability of the synthesis network training process, and improve the precision and generalization performance of image synthesis, thereby promoting improvements in the precision and generalization performance of medical technologies such as medical image segmentation, medical image registration, and medical image recognition.
In addition, by constructing the countermeasure loss function and the global consistency loss function in the improved conditional generation countermeasure network, this embodiment can directly give quantitative evaluation parameters and determine the degree of structural closeness between the reconstructed fundus retina image and the original fundus retina image.
The invention also provides an image synthesis system combining a countermeasure autoencoder and a generation countermeasure network, and fig. 9 is a schematic structural diagram of the image synthesis system combining a countermeasure autoencoder and a generation countermeasure network according to an embodiment of the invention.
Referring to fig. 9, the image synthesizing system combining the countermeasure autoencoder and the generation countermeasure network of the present embodiment includes:
a training data obtaining module 201, configured to obtain an artificial segmented blood vessel tree image and an original fundus retina image to obtain training data.
An enhanced countermeasure autoencoder construction module 202 for constructing an enhanced countermeasure autoencoder; the enhanced countermeasure automatic encoder comprises two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders; the two groups of encoders of different categories comprise a content encoder and a style encoder, and the two groups of first discriminators of different categories comprise a content encoding discriminator and a style encoding discriminator; the two groups of encoders of different categories are used for obtaining, from the input artificially segmented blood vessel tree image, a content encoding vector and a style encoding vector of the artificially segmented blood vessel tree image; the two groups of first discriminators of different categories are used for discriminating between the content encoding vector and the artificially acquired prior content encoding vector, and between the style encoding vector and the artificially acquired prior style encoding vector, and for performing reverse adjustment training; and the decoder is used for recombining the content encoding vector and the style encoding vector to obtain a reconstructed blood vessel tree image.
A modified conditional generation countermeasure network construction module 203 for constructing a modified conditional generation countermeasure network; the improved conditional generation countermeasure network comprises a generator and a second discriminator; the generator is used for generating a reconstructed fundus retina image according to the artificial segmented retina outline mask and the reconstructed blood vessel tree image output by the construction enhanced countermeasure automatic encoder; and the second discriminator is used for judging the reconstructed fundus retina image and carrying out reverse adjustment training.
And the model training module 204 is configured to perform iterative training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network by using the artificially segmented blood vessel tree image and the original fundus retina image as training data, to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator.
And a composite image determining module 205, configured to perform fundus retina image synthesis on the to-be-processed artificially segmented blood vessel tree image based on the optimal blood vessel tree image generator and the optimal fundus retina image generator, so as to obtain a composite image.
As an optional implementation, the enhanced countermeasure autoencoder constructing module 202 specifically includes:
a content encoder constructing unit for constructing a content encoder; the content encoder comprises a first down-sampling layer and a first residual block network which are connected in sequence; the first down-sampling layer comprises a convolution layer and a first convolution block; the first network of residual blocks comprises a plurality of first standard residual blocks; the convolution layer comprises a convolution kernel and an activation function layer; the first convolution block comprises a convolution kernel, an instance normalization layer and an activation function layer; the first standard residual block includes two of the first convolution blocks connected in a jump.
A style encoder constructing unit for constructing a style encoder; the style encoder comprises a second down-sampling layer, a global average pooling layer and a full-connection layer which are connected in sequence; the second downsampling layer includes a plurality of the convolutional layers.
A content encoding discriminator constructing unit for constructing a content encoding discriminator; the content encoding discriminator includes a plurality of the first convolution blocks connected in sequence.
A style code discriminator construction unit for constructing a style code discriminator; the style coding discriminator comprises a plurality of second convolution blocks which are connected in sequence.
A decoder construction unit for constructing a decoder; the decoder comprises a second residual block network, a multilayer perceptron and an upsampling layer; the second network of residual blocks comprises a plurality of first standard residual blocks; the first standard residual block comprises two second convolution blocks connected in a jumping-over manner; said second convolution block includes a convolution layer, an adaptive instance normalization layer, and an activation function layer; the multilayer perceptron is used for outputting self-adaptive example normalized style parameters corresponding to the style coding vectors; the up-sampling layer comprises an anti-convolution layer and an activation function layer; the number of upsampling layers matches the sum of the number of first downsampling layers and the number of second downsampling layers.
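The adaptive instance normalization used in the decoder's second convolution block can be sketched as follows. In this illustrative sketch (assumed toy shapes, not the patent's code), the style parameters gamma and beta are supplied directly, standing in for the output of the multilayer perceptron that would predict them from the style encoding vector.

```python
import numpy as np

def adaptive_instance_norm(content, gamma, beta, eps=1e-5):
    # Normalize each (sample, channel) plane of the content features, then
    # apply the per-channel style scale (gamma) and shift (beta) parameters.
    mean = content.mean(axis=(2, 3), keepdims=True)
    std = content.std(axis=(2, 3), keepdims=True)
    normed = (content - mean) / (std + eps)
    return gamma[:, :, None, None] * normed + beta[:, :, None, None]

rng = np.random.default_rng(3)
content = rng.standard_normal((2, 4, 8, 8))  # content features (N, C, H, W)
gamma = rng.standard_normal((2, 4))          # per-channel style scales
beta = rng.standard_normal((2, 4))           # per-channel style shifts
styled = adaptive_instance_norm(content, gamma, beta)
```

With unit scale and zero shift, each normalized channel has zero mean, so the layer's statistics are fully determined by the style parameters.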
A generator constructing unit for constructing a generator; the generator comprises a convolution layer, a deconvolution layer and a channel-by-channel full connection layer, and the connection mode of the generator is jump connection; the number of the convolution layers is matched with that of the deconvolution layers; the convolution layer comprises a convolution kernel and an activation function layer; the deconvolution layer includes a deconvolution kernel and an activation function layer.
A second discriminator construction unit for constructing a second discriminator; the second discriminator comprises a convolution layer, a full connection layer and a binary layer corresponding to the discriminator; and the convolution layer corresponding to the discriminator comprises a combined batch normalization layer and an activation function layer.
As an optional implementation manner, the model training module 204 specifically includes:
the training data acquisition unit is used for acquiring an artificial segmented blood vessel tree image and an original fundus retina image to obtain training data; the artificially segmented blood vessel tree image corresponds to the original fundus retina image.
And the vessel tree image reconstruction unit is used for respectively taking the artificially segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image.
And an encoding judgment unit configured to perform discrimination by using the content encoding vector and the prior content encoding vector as inputs to the content encoding discriminator, and perform discrimination by using the genre encoding vector and the prior genre encoding vector as inputs to the genre encoding discriminator.
An enhanced countermeasure automatic encoder loss function establishing unit, configured to construct an encoder reconstruction loss function and a first discriminator countermeasure loss function according to the artificially segmented vessel tree image and the reconstructed vessel tree image; the encoder reconstruction loss function is a reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator confrontation loss function is the confrontation loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced confrontation automatic encoder.
And the fundus retina image reconstruction unit is used for generating a reconstructed fundus retina image by taking the artificial segmentation retina outline mask, the reconstructed blood vessel tree image, the artificial segmentation blood vessel tree image and the original fundus retina image as the input of the generator.
And a fundus retina image discrimination unit for taking the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificially segmented blood vessel tree image and the original fundus retina image as input of a second discriminator to perform discrimination.
A generation countermeasure network loss function establishing unit for constructing a second discriminator countermeasure loss function and a generator global consistency loss function according to the original fundus retina image and the reconstructed fundus retina image; the second arbiter confrontation loss function generates the confrontation loss function of the second arbiter in the confrontation network for the improved conditional; the producer global consistency loss function generates a global consistency loss function for producers in the countermeasure network for the improved conditional.
And the synthetic model total loss function establishing unit is used for obtaining a synthetic model total loss function according to the encoder reconstruction loss function, the first discriminator countermeasure loss function, the second discriminator countermeasure loss function and the generator global consistency loss function.
An optimal model determining unit, configured to jointly train the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network in a back propagation manner according to the total loss function of the synthetic model, so that parameters in the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network are continuously updated and optimized to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator; the optimal blood vessel tree image generator is the trained enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network.
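The joint back-propagation training described above alternates discriminator and generator/encoder updates each iteration. The following schematic sketch uses hypothetical step names and decaying stand-in loss values in place of real optimizer steps, purely to show the alternating update order.

```python
def train_jointly(num_iters, step_fns):
    # Run the alternating update schedule and record (iteration, step, loss).
    history = []
    for it in range(num_iters):
        for name, step in step_fns:
            loss = step(it)
            history.append((it, name, loss))
    return history

# Stand-in loss curves: decaying values simulate convergence (assumed shapes
# of the real update steps on the synthetic model total loss).
steps = [
    ("first_discriminators", lambda it: 1.0 / (it + 1)),    # D_zc, D_zs update
    ("second_discriminator", lambda it: 0.8 / (it + 1)),    # conditional-GAN D
    ("encoders_and_generator", lambda it: 2.0 / (it + 1)),  # Q_c, Q_s, R and G
]
history = train_jointly(5, steps)
```

Each iteration touches every parameter group once, matching the continuous update-and-optimize cycle that yields the optimal generators.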
As an optional implementation manner, the synthetic model total loss function establishing unit specifically includes:
a first total loss function determining subunit, configured to obtain a total loss function of the automatic counter encoder according to the encoder reconstruction loss function and the first discriminator counter loss function;
a second total loss function determining subunit, configured to obtain a generated confrontation network total loss function according to the second discriminator confrontation loss function and the generator global consistency loss function;
and the third total loss function determining subunit is used for performing linear summation on the total loss function of the countermeasure automatic encoder and the total loss function of the generated countermeasure network to obtain a total loss function of the synthetic model.
As an alternative embodiment, the countermeasure autoencoder total loss function in the first total loss function determining subunit is

$$\mathcal{L}_{AAE}\left(D_{z_c},D_{z_s},Q_c,Q_s,R\right)=\mathcal{L}_{adv}^{D_{z_c}}+\mathcal{L}_{adv}^{D_{z_s}}+\lambda_0 L_{Recon}$$

wherein $\mathcal{L}_{AAE}$ is the countermeasure autoencoder total loss function; $D_{z_c}$ is the content encoding discriminator, $D_{z_s}$ is the style encoding discriminator, $Q_c$ is the content encoder, $Q_s$ is the style encoder, and $R$ is the decoder; $\mathcal{L}_{adv}^{D_{z_c}}$ is the countermeasure loss corresponding to the content encoding discriminator, and $\mathcal{L}_{adv}^{D_{z_s}}$ is the countermeasure loss corresponding to the style encoding discriminator; $L_{Recon}$ is the encoder reconstruction loss; the first discriminator comprises $D_{z_c}$ and $D_{z_s}$; $\lambda_0$ is the weight parameter corresponding to the countermeasure autoencoder total loss function, and is used for balancing the countermeasure losses and the reconstruction loss;

$$L_{Recon}=\mathbb{E}_{v\sim P_{data}(v)}\left[\left\|v-R\left(Q_c(v),Q_s(v)\right)\right\|_1\right]$$

wherein $v$ represents the artificially segmented blood vessel tree image; $\mathbb{E}_{v\sim P_{data}(v)}$ represents the expected value corresponding to the artificially segmented blood vessel tree image, and $P_{data}(v)$ represents the data distribution of the artificially segmented blood vessel tree image; $Q_c(v)$ is the content encoder with input $v$, and $Q_s(v)$ is the style encoder with input the artificially segmented blood vessel tree image;

$$\mathcal{L}_{adv}^{D_{z_c}}=\mathbb{E}_{z_c\sim P(z_c)}\left[\log D_{z_c}(z_c)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_c}\left(Q_c(z_c\mid v)\right)\right)\right]$$

$$\mathcal{L}_{adv}^{D_{z_s}}=\mathbb{E}_{z_s\sim P(z_s)}\left[\log D_{z_s}(z_s)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_s}\left(Q_s(z_s\mid v)\right)\right)\right]$$

wherein $z_c$ represents the content encoding vector, and $z_s$ represents the style encoding vector; $P(z_c)$ represents the prior distribution of the artificially added content latent variable, and $P(z_s)$ represents the prior distribution of the artificially added style latent variable; $\mathbb{E}_{z_c\sim P(z_c)}$ represents the expected value corresponding to the content encoding vector, and $\mathbb{E}_{z_s\sim P(z_s)}$ represents the expected value corresponding to the style encoding vector; $Q_c(z_c\mid v)$ represents the content encoding distribution function, $Q_s(z_s\mid v)$ represents the style encoding distribution function, $Q_c(z_c\mid v)$ is used for obtaining the content encoding vector, and $Q_s(z_s\mid v)$ is used for obtaining the style encoding vector; $D_{z_c}(z_c)$ is the content encoding discriminator with input $z_c$, and $D_{z_s}(z_s)$ is the style encoding discriminator with input $z_s$; $D_{z_c}(Q_c(z_c\mid v))$ is the content encoding discriminator with input $Q_c(z_c\mid v)$, and $D_{z_s}(Q_s(z_s\mid v))$ is the style encoding discriminator with input $Q_s(z_s\mid v)$.
As an optional implementation manner, the generated countermeasure network total loss function determined by the second total loss function determining subunit is

$$L_{im2im}(G,D)=L_{adv}(G,D)+\lambda_1\mathcal{L}_{global}(G)$$

$$\mathcal{L}_{global}(G)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\left\|r-G(v,m)\right\|_1\right]$$

wherein $L_{im2im}(G,D)$ is the generated countermeasure network total loss function; $G$ represents the generator in the improved conditional generation countermeasure network, and $D$ represents the second discriminator in the improved conditional generation countermeasure network; $L_{adv}(G,D)$ represents the second discriminator countermeasure loss function; $\mathcal{L}_{global}(G)$ represents the generator global consistency loss function; $\mathbb{E}_{(v,r)\sim P_{data}(v,r)}$ represents the expected value corresponding to the artificially segmented blood vessel tree image and the original fundus retina image; $v$ represents the artificially segmented blood vessel tree image, $r$ represents the original fundus retina image, $\tilde{v}$ represents the reconstructed blood vessel tree image, and $m$ is the retina outer contour mask; $\lambda_1$ is the weight parameter corresponding to the generated countermeasure network total loss function, and is used for balancing the generator global consistency loss and the second discriminator countermeasure loss; $G(v,m)$ represents the generator with inputs $v$ and $m$;

$$L_{adv}(G,D)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\log D(m,(v,r))\right]+\mathbb{E}_{\tilde{v}}\left[\log\left(1-D\left(m,\left(\tilde{v},G(m,\tilde{v})\right)\right)\right)\right]$$

wherein $\mathbb{E}_{\tilde{v}}$ represents the expected value corresponding to the reconstructed blood vessel tree image; $D(m,(v,r))$ represents the second discriminator with inputs $m$, $v$, and $r$; $D(m,(\tilde{v},G(m,\tilde{v})))$ is the second discriminator with inputs $m$, $\tilde{v}$, and $G(m,\tilde{v})$; $G(m,\tilde{v})$ represents the generator with inputs $m$ and $\tilde{v}$.
As an alternative, the synthetic model total loss function in the third total loss function determining subunit is

$$\mathcal{L}_{total}=\mathcal{L}_{AAE}+L_{im2im}(G,D)$$

wherein $\mathcal{L}_{total}$ represents the synthetic model total loss function.
The image synthesis system combining the countermeasure autoencoder and the generation countermeasure network in this embodiment can augment a limited medical image data set with additional samples, enrich a medical image database, improve the stability of the synthesis network training process, and improve the precision and generalization performance of image synthesis, thereby promoting improvements in the precision and generalization performance of medical technologies such as medical image segmentation, medical image registration, and medical image recognition.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principle and the embodiment of the present invention are explained by applying specific examples, and the above description of the embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (8)

1. An image synthesis method combining a countermeasure autoencoder and a generation countermeasure network, comprising:
constructing an enhanced countermeasure autoencoder; the enhanced countermeasure automatic encoder comprises two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders; the two groups of encoders of different categories comprise a content encoder and a style encoder, and the two groups of first discriminators of different categories comprise a content encoding discriminator and a style encoding discriminator; the two groups of encoders of different categories are used for obtaining, from an input artificially segmented blood vessel tree image, a content encoding vector and a style encoding vector of the artificially segmented blood vessel tree image; the two groups of first discriminators of different categories are used for discriminating between the content encoding vector and an artificially acquired prior content encoding vector, and between the style encoding vector and an artificially acquired prior style encoding vector, and for performing reverse adjustment training; the decoder is used for recombining the content encoding vector and the style encoding vector to obtain a reconstructed blood vessel tree image;
constructing an improved conditional generation countermeasure network; the improved conditional generation countermeasure network comprises a generator and a second discriminator; the generator is used for generating a reconstructed fundus retina image according to the artificial segmentation retina outline mask and the reconstructed blood vessel tree image output by the enhanced countermeasure automatic encoder; the second discriminator is used for judging the reconstructed fundus retina image and carrying out reverse adjustment training;
taking the artificially segmented blood vessel tree image and an original fundus retina image as training data, and performing iterative training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator;
performing fundus retina image synthesis on the to-be-processed artificially segmented blood vessel tree image based on the optimal blood vessel tree image generator and the optimal fundus retina image generator to obtain a synthesized image;
the step of performing iterative training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network by using the artificially segmented blood vessel tree image and the original fundus retinal image as training data to obtain an optimal blood vessel tree image generator and an optimal fundus retinal image generator specifically includes:
acquiring an artificial segmentation blood vessel tree image and an original fundus retina image to obtain training data; the artificial segmented blood vessel tree image corresponds to the original fundus retina image;
respectively taking the manually segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image;
taking the content coding vector and the prior content coding vector as the input of the content coding discriminator to distinguish, and taking the style coding vector and the prior style coding vector as the input of the style coding discriminator to distinguish;
constructing an encoder reconstruction loss function and a first discriminator countervailing loss function according to the artificially segmented blood vessel tree image and the reconstructed blood vessel tree image; the encoder reconstruction loss function is a reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator confrontation loss function is a confrontation loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced confrontation automatic encoder;
using the artificial segmented retina outer contour mask, the reconstructed blood vessel tree image, the artificial segmented blood vessel tree image and the original fundus retina image as the input of the generator to generate a reconstructed fundus retina image;
taking the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificial segmented blood vessel tree image and the original fundus retina image as input of a second discriminator to carry out judgment;
constructing a second discriminator countermeasure loss function and a generator global consistency loss function according to the original fundus retina image and the reconstructed fundus retina image; the second arbiter confrontation loss function generates the confrontation loss function of the second arbiter in the confrontation network for the improved conditional; the generator global consistency loss function is a global consistency loss function of the generator in the countermeasure network generated by the improved conditional expression;
obtaining a total loss function of a synthetic model according to the reconstruction loss function of the encoder, the countermeasure loss function of the first discriminator, the countermeasure loss function of the second discriminator and the global consistency loss function of the generator;
performing joint training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network in a back propagation manner according to the total loss function of the synthetic model, so that parameters in the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network are continuously updated and optimized to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator; the optimal blood vessel tree image generator is the trained enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network.
2. The image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to claim 1, characterized in that said constructing an enhanced countermeasure autoencoder comprises:
constructing a content encoder; the content encoder comprises a first down-sampling layer and a first residual block network which are connected in sequence; the first down-sampling layer comprises a convolution layer and a first convolution block; the first network of residual blocks comprises a plurality of first standard residual blocks; the convolution layer comprises a convolution kernel and an activation function layer; the first convolution block comprises a convolution kernel, an instance normalization layer and an activation function layer; the first standard residual block comprises two first convolution blocks connected in a jumping-over manner;
constructing a style encoder; the style encoder comprises a second down-sampling layer, a global average pooling layer and a full-connection layer which are connected in sequence; the second downsampling layer includes a plurality of the convolutional layers;
constructing a content coding discriminator; the content coding discriminator comprises a plurality of first convolution blocks which are connected in sequence;
constructing a style code discriminator; the style coding discriminator comprises a plurality of second convolution blocks which are connected in sequence;
constructing a decoder; the decoder comprises a second residual block network, a multilayer perceptron and an upsampling layer; the second network of residual blocks comprises a plurality of first standard residual blocks; the first standard residual block comprises two second convolution blocks connected in a jumping-over manner; said second convolution block includes a convolution layer, an adaptive instance normalization layer, and an activation function layer; the multilayer perceptron is used for outputting self-adaptive example normalized style parameters corresponding to the style coding vectors; the up-sampling layer comprises an anti-convolution layer and an activation function layer; the number of upsampling layers matches the sum of the number of first downsampling layers and the number of second downsampling layers.
3. The image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to claim 1, characterized in that said constructing an improved conditional generation countermeasure network comprises:
constructing a generator; the generator comprises a convolution layer, a deconvolution layer and a channel-by-channel full connection layer, and the connection mode of the generator is jump connection; the number of the convolution layers is matched with that of the deconvolution layers; the convolution layer comprises a convolution kernel and an activation function layer; the deconvolution layer comprises a deconvolution kernel and an activation function layer;
constructing a second discriminator; the second discriminator comprises a convolution layer, a full connection layer and a binary layer corresponding to the discriminator; and the convolution layer corresponding to the discriminator comprises a combined batch normalization layer and an activation function layer.
4. The image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to claim 1, characterized in that said obtaining a synthetic model total loss function from the encoder reconstruction loss function, the first discriminator countermeasure loss function, the second discriminator countermeasure loss function, and the generator global consistency loss function specifically comprises:
obtaining a total loss function of the confrontation automatic encoder according to the encoder reconstruction loss function and the first discriminator confrontation loss function;
obtaining a total loss function of the generated countermeasure network according to the countermeasure loss function of the second discriminator and the global consistency loss function of the generator;
and linearly adding the total loss function of the countermeasure automatic encoder and the total loss function of the generated countermeasure network to obtain a total loss function of the synthetic model.
5. The method of image synthesis combining an adversarial autoencoder and a generative adversarial network according to claim 4, wherein the adversarial autoencoder total loss function is

$$L_{AAE}(Q_c, Q_s, R, D_{z_c}, D_{z_s}) = L_{adv}^{z_c}(Q_c, D_{z_c}) + L_{adv}^{z_s}(Q_s, D_{z_s}) + \lambda_0 L_{Recon}(Q_c, Q_s, R)$$

wherein $L_{AAE}$ is the adversarial autoencoder total loss function; $D_{z_c}$ is the content code discriminator and $D_{z_s}$ is the style code discriminator; $Q_c$ is the content encoder, $Q_s$ is the style encoder, and $R$ is the decoder; $L_{adv}^{z_c}$ is the adversarial loss of the content code discriminator and $L_{adv}^{z_s}$ is the adversarial loss of the style code discriminator; $L_{Recon}$ is the encoder reconstruction loss; the first discriminator comprises $D_{z_c}$ and $D_{z_s}$; $\lambda_0$ is the weight parameter of the adversarial autoencoder total loss function, used to balance the adversarial losses against the reconstruction loss;

$$L_{Recon} = \mathbb{E}_{v \sim P_{data}(v)}\left[\left\| v - R\big(Q_c(v), Q_s(v)\big)\right\|\right]$$

wherein $v$ denotes an artificially segmented vessel tree image; $\mathbb{E}_{v \sim P_{data}(v)}$ denotes the expectation over the artificially segmented vessel tree images, and $P_{data}(v)$ denotes their data distribution; $Q_c(v)$ denotes the content encoder with input $v$ and $Q_s(v)$ the style encoder with input $v$;

$$L_{adv}^{z_c} = \mathbb{E}_{z_c \sim P(z_c)}\big[\log D_{z_c}(z_c)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_c}(Q_c(z_c \mid v))\big)\big]$$

$$L_{adv}^{z_s} = \mathbb{E}_{z_s \sim P(z_s)}\big[\log D_{z_s}(z_s)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_s}(Q_s(z_s \mid v))\big)\big]$$

wherein $z_c$ denotes the content code vector and $z_s$ the style code vector; $P(z_c)$ denotes the prior distribution of the artificially imposed content latent variable and $P(z_s)$ the prior distribution of the artificially imposed style latent variable; $\mathbb{E}_{z_c \sim P(z_c)}$ denotes the expectation over the content code prior and $\mathbb{E}_{z_s \sim P(z_s)}$ the expectation over the style code prior; $Q_c(z_c \mid v)$ denotes the content encoding distribution function and $Q_s(z_s \mid v)$ the style encoding distribution function, from which the content code vector and the style code vector are obtained; $D_{z_c}(z_c)$ is the content code discriminator with input $z_c$ and $D_{z_s}(z_s)$ the style code discriminator with input $z_s$; $D_{z_c}(Q_c(z_c \mid v))$ is the content code discriminator with input $Q_c(z_c \mid v)$ and $D_{z_s}(Q_s(z_s \mid v))$ the style code discriminator with input $Q_s(z_s \mid v)$.
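Under the usual adversarial-autoencoder reading of claim 5, the total loss can be assembled numerically as below. This is a hedged sketch: discriminator outputs are assumed to lie in (0, 1), the reconstruction norm is taken as L1 (the claim does not fix the norm), and all function names are illustrative:

```python
import numpy as np

def code_adversarial_loss(d_prior, d_encoded):
    """Adversarial loss on latent codes: E[log D(z)] over codes drawn
    from the prior plus E[log(1 - D(Q(z|v)))] over encoder outputs.
    d_prior / d_encoded: arrays of discriminator outputs in (0, 1)."""
    return np.mean(np.log(d_prior)) + np.mean(np.log(1.0 - d_encoded))

def aae_total_loss(d_c_prior, d_c_enc, d_s_prior, d_s_enc,
                   v, v_recon, lam0=1.0):
    """L_AAE = content-code adversarial term + style-code adversarial
    term + lam0 * encoder reconstruction term."""
    l_adv_c = code_adversarial_loss(d_c_prior, d_c_enc)
    l_adv_s = code_adversarial_loss(d_s_prior, d_s_enc)
    l_recon = np.mean(np.abs(v - v_recon))  # L1 norm assumed here
    return l_adv_c + l_adv_s + lam0 * l_recon
```

With a perfect reconstruction the loss reduces to the two adversarial terms alone, which is the balance that `lam0` ($\lambda_0$) controls.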
6. The method of image synthesis combining an adversarial autoencoder and a generative adversarial network according to claim 5, wherein the generative adversarial network total loss function is

$$L_{im2im}(G, D) = L_{adv}(G, D) + \lambda_1 L_{global}(G)$$

wherein $L_{im2im}(G, D)$ is the generative adversarial network total loss function; $G$ denotes the generator in the improved conditional generative adversarial network and $D$ denotes the second discriminator in the improved conditional generative adversarial network; $L_{adv}(G, D)$ denotes the second discriminator adversarial loss function; $L_{global}(G)$ denotes the generator global consistency loss function;

$$L_{global}(G) = \mathbb{E}_{(v, r)}\left[\left\| m \odot \big(r - G(\hat{v}, m)\big)\right\|_1\right]$$

wherein $\mathbb{E}_{(v, r)}$ denotes the expectation over pairs of artificially segmented vessel tree images and original fundus retina images; $v$ denotes the artificially segmented vessel tree image, $r$ denotes the original fundus retina image, $\hat{v}$ denotes the reconstructed vessel tree image, and $m$ is the retina outer contour mask; $\lambda_1$ is the weight parameter of the generative adversarial network total loss function, used to balance the generator global consistency loss against the second discriminator adversarial loss; $G(\hat{v}, m)$ denotes the generator with inputs $\hat{v}$ and $m$;

$$L_{adv}(G, D) = \mathbb{E}_{(v, r)}\big[\log D(m, (v, r))\big] + \mathbb{E}_{\hat{v}}\big[\log\big(1 - D(m, (\hat{v}, G(m, \hat{v})))\big)\big]$$

wherein $\mathbb{E}_{\hat{v}}$ denotes the expectation over the reconstructed vessel tree images; $D(m, (v, r))$ denotes the second discriminator with inputs $m$, $v$ and $r$; $D(m, (\hat{v}, G(m, \hat{v})))$ denotes the second discriminator with inputs $m$, $\hat{v}$ and $G(m, \hat{v})$; $G(m, \hat{v})$ denotes the generator with inputs $m$ and $\hat{v}$.
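Claim 6 can likewise be sketched numerically: an adversarial term over the second discriminator's outputs plus a mask-restricted consistency term between the original and generated fundus images. Discriminator outputs in (0, 1), a binary mask, and an L1 penalty are assumptions of this sketch, and all names are illustrative:

```python
import numpy as np

def global_consistency_loss(r, r_fake, m):
    """L1 difference between the original fundus image r and the
    generated image r_fake, evaluated only inside the (binary)
    retina outer contour mask m."""
    return np.sum(m * np.abs(r - r_fake)) / np.maximum(np.sum(m), 1.0)

def gan_total_loss(d_real, d_fake, r, r_fake, m, lam1=1.0):
    """L_im2im = adversarial term + lam1 * global consistency term."""
    l_adv = np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))
    return l_adv + lam1 * global_consistency_loss(r, r_fake, m)
```

Restricting the consistency term to the mask keeps the generator from being penalized for the black background outside the retina.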
7. The method of image synthesis combining an adversarial autoencoder and a generative adversarial network according to claim 6, wherein the synthesis model total loss function is

$$L_{total} = L_{AAE} + L_{im2im}$$

wherein $L_{total}$ denotes the synthesis model total loss function.
8. An image synthesis system combining an adversarial autoencoder and a generative adversarial network, comprising:
an enhanced adversarial autoencoder construction module for constructing an enhanced adversarial autoencoder; the enhanced adversarial autoencoder comprises two encoders of different classes, two first discriminators of different classes, and one decoder; the two encoders comprise a content encoder and a style encoder, and the two first discriminators comprise a content code discriminator and a style code discriminator; the two encoders are used for obtaining, from an input artificially segmented vessel tree image, a content code vector and a style code vector of that image; the two first discriminators are used for discriminating between the content code vector and an artificially imposed prior content code vector, discriminating between the style code vector and an artificially imposed prior style code vector, and performing back-propagation training; the decoder is used for recombining the content code vector and the style code vector to obtain a reconstructed vessel tree image;
an improved conditional generative adversarial network construction module for constructing an improved conditional generative adversarial network; the improved conditional generative adversarial network comprises a generator and a second discriminator; the generator is used for generating a reconstructed fundus retina image from the artificially segmented retina outer contour mask and the reconstructed vessel tree image output by the enhanced adversarial autoencoder; the second discriminator is used for discriminating the reconstructed fundus retina image and performing back-propagation training;
a model training module for taking the artificially segmented vessel tree images and the original fundus retina images as training data and iteratively training the enhanced adversarial autoencoder and the improved conditional generative adversarial network to obtain an optimal vessel tree image generator and an optimal fundus retina image generator;
a synthesized image determination module for synthesizing a fundus retina image from an artificially segmented vessel tree image to be processed, based on the optimal vessel tree image generator and the optimal fundus retina image generator, to obtain a synthesized image;
the model training module specifically comprises:
a training data acquisition unit for acquiring artificially segmented vessel tree images and original fundus retina images to obtain training data; each artificially segmented vessel tree image corresponds to an original fundus retina image;
a vessel tree image reconstruction unit for taking the artificially segmented vessel tree image as the input of the content encoder and of the style encoder to obtain a content code vector and a style code vector, and taking the content code vector and the style code vector as the input of the decoder to obtain a reconstructed vessel tree image;
a code discrimination unit configured to take the content code vector and the prior content code vector as inputs of the content code discriminator for discrimination, and to take the style code vector and the prior style code vector as inputs of the style code discriminator for discrimination;
an enhanced adversarial autoencoder loss function establishing unit for constructing an encoder reconstruction loss function and a first discriminator adversarial loss function from the artificially segmented vessel tree image and the reconstructed vessel tree image; the encoder reconstruction loss function is the reconstruction loss function of the content encoder and the style encoder in the enhanced adversarial autoencoder; the first discriminator adversarial loss function is the adversarial loss function of the content code discriminator and the style code discriminator in the enhanced adversarial autoencoder;
a fundus retina image reconstruction unit for generating a reconstructed fundus retina image by taking the artificially segmented retina outer contour mask, the reconstructed vessel tree image, the artificially segmented vessel tree image and the original fundus retina image as the inputs of the generator;
a fundus retina image discrimination unit configured to take the reconstructed vessel tree image, the reconstructed fundus retina image, the artificially segmented vessel tree image and the original fundus retina image as inputs of the second discriminator for discrimination;
a generative adversarial network loss function establishing unit for constructing a second discriminator adversarial loss function and a generator global consistency loss function from the original fundus retina image and the reconstructed fundus retina image; the second discriminator adversarial loss function is the adversarial loss function of the second discriminator in the improved conditional generative adversarial network; the generator global consistency loss function is the global consistency loss function of the generator in the improved conditional generative adversarial network;
a synthesis model total loss function establishing unit for obtaining a synthesis model total loss function from the encoder reconstruction loss function, the first discriminator adversarial loss function, the second discriminator adversarial loss function and the generator global consistency loss function;
an optimal model determination unit for jointly training the enhanced adversarial autoencoder and the improved conditional generative adversarial network by back propagation according to the synthesis model total loss function, so that the parameters of the enhanced adversarial autoencoder and of the improved conditional generative adversarial network are continuously updated and optimized to obtain an optimal vessel tree image generator and an optimal fundus retina image generator; the optimal vessel tree image generator is the trained joint enhanced adversarial autoencoder, and the optimal fundus retina image generator is the trained improved conditional generative adversarial network.
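At inference time, the trained system of claim 8 chains the two networks: the adversarial autoencoder reconstructs the vessel tree, and the conditional generator renders the fundus image from that reconstruction and the retina contour mask. A schematic of the pipeline with the four trained components passed in as callables (all names illustrative; the stub components in the usage below only demonstrate the data flow):

```python
def synthesize(v, m, content_encoder, style_encoder, decoder, generator):
    """Two-stage synthesis: vessel tree image v and retina contour
    mask m in, synthesized fundus retina image out."""
    z_c = content_encoder(v)       # content code vector
    z_s = style_encoder(v)         # style code vector
    v_hat = decoder(z_c, z_s)      # reconstructed vessel tree image
    return generator(v_hat, m)     # rendered fundus retina image
```

For example, with trivial stand-ins for the trained networks, `synthesize(2.0, 0.5, lambda v: 0.5 * v, lambda v: 0.5 * v, lambda zc, zs: zc + zs, lambda vh, m: vh * m)` traces the same data flow end to end.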
CN202010169306.5A 2020-03-12 2020-03-12 Image synthesis method and system combining countermeasure autoencoder and generation countermeasure network Expired - Fee Related CN111402179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010169306.5A CN111402179B (en) 2020-03-12 2020-03-12 Image synthesis method and system combining countermeasure autoencoder and generation countermeasure network

Publications (2)

Publication Number Publication Date
CN111402179A CN111402179A (en) 2020-07-10
CN111402179B true CN111402179B (en) 2022-08-09






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220809