CN111402179B - Image synthesis method and system combining an adversarial autoencoder and a generative adversarial network - Google Patents

Image synthesis method and system combining an adversarial autoencoder and a generative adversarial network

Info

Publication number
CN111402179B
CN111402179B (application CN202010169306.5A)
Authority
CN
China
Prior art keywords
image
discriminator
loss function
adversarial
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202010169306.5A
Other languages
Chinese (zh)
Other versions
CN111402179A (en)
Inventor
张桂梅
胡强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Hangkong University
Original Assignee
Nanchang Hangkong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Hangkong University filed Critical Nanchang Hangkong University
Priority to CN202010169306.5A priority Critical patent/CN111402179B/en
Publication of CN111402179A publication Critical patent/CN111402179A/en
Application granted granted Critical
Publication of CN111402179B publication Critical patent/CN111402179B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30101Blood vessel; Artery; Vein; Vascular

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The invention discloses an image synthesis method and system combining an adversarial autoencoder and a generative adversarial network. The method comprises: constructing an enhanced adversarial autoencoder comprising two groups of encoders of different classes, two groups of first discriminators of different classes, and a group of decoders; constructing an improved conditional generative adversarial network comprising a generator and a second discriminator; taking manually segmented vessel-tree images and the corresponding original fundus retina images as training data, and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator; and synthesizing a fundus retina image from a manually segmented vessel-tree image to be processed based on the two optimal generators, obtaining a synthesized image. The invention can generate sample data with higher precision and more diverse styles, effectively augmenting a limited set of training samples.

Description

Image synthesis method and system combining an adversarial autoencoder and a generative adversarial network
Technical Field
The present invention relates to the field of image processing, and more particularly to an image synthesis method and system combining an adversarial autoencoder and a generative adversarial network.
Background
In medical applications, a high-performing medical image processing algorithm requires a large amount of effective medical image data with specific annotation information as training samples. In clinical practice, however, medical image data are difficult to acquire directly: imaging can expose patients to harmful radiation and the data involve patient privacy, so image samples of many human organs are very scarce. At the same time, lesion complexity varies between organs, making manual annotation of effective feature labels both difficult and expensive. Therefore, to advance medical techniques such as medical image segmentation, recognition, and registration, the synthesis of augmented medical image sample datasets has attracted wide attention.
Surveying current research on medical image data augmentation at home and abroad, the related methods fall into two categories: traditional augmentation methods and deep-learning-based augmentation methods. Traditional augmentation comprises rigid transformations, such as translation, flipping, rotation, scaling, and affine transformation, and non-rigid transformations, such as elastic deformation. A rigid transformation maps the image through a set of fixed transformation matrices, so the resulting augmented dataset contains only a few preset correspondences. A non-rigid transformation instead constrains individual pixels within a given range, such as a local rotation angle or translation distance; although this can both enlarge the dataset and introduce diverse variation, the transformed images lack a specific evaluation standard, it is difficult to keep the transformation within a reasonable range, and the resulting augmented data are unstable. Thus, although traditional augmentation increases the number of limited data samples, it adds little genuine diversity: a true rotation, for example, would rotate every pixel through a given angle in 3D, not merely rotate the image plane itself.
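The rigid transformations mentioned above reduce to applying a fixed 2D rotation-plus-translation matrix to pixel coordinates. A minimal nearest-neighbour sketch (illustrative, not the patent's implementation; real pipelines would use a library warp with interpolation):

```python
import numpy as np

def rigid_augment(image, angle_deg, tx=0, ty=0):
    """Rotate an image about its center by a fixed angle, then translate.

    Minimal nearest-neighbour rigid (rotation + translation) augmentation.
    """
    h, w = image.shape
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            # inverse-map each output pixel to its source location
            xs = c * (x - cx - tx) + s * (y - cy - ty) + cx
            ys = -s * (x - cx - tx) + c * (y - cy - ty) + cy
            xi, yi = int(round(xs)), int(round(ys))
            if 0 <= xi < w and 0 <= yi < h:
                out[y, x] = image[yi, xi]
    return out

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 3] = 255                  # single bright pixel right of center
rot = rigid_augment(img, 90)     # 90-degree rotation about the center
```

The augmented set produced this way contains only as many variants as there are preset matrices, which is exactly the diversity limitation described above.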
To overcome the problems and limitations of traditional augmentation, in 2014 researchers proposed generative adversarial network (GAN) synthesis, which produces augmented samples similar to the real dataset in pixel intensity and structure without relying on large numbers of labeled samples during training; the generator network and discriminator network play a game against each other to produce the optimal output. However, GANs are extremely difficult to train in practice, and problems such as vanishing gradients and overfitting readily occur. To alleviate this, researchers proposed the least-squares GAN, which constructs the loss function by a least-squares criterion with stricter convergence conditions. To make GAN synthesis generate the required samples more purposefully and efficiently, researchers further proposed the conditional GAN, which synthesizes images by supplying a small amount of label information to the training model.
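The least-squares criterion mentioned above replaces the usual log loss with squared errors against target labels (real = 1, fake = 0, generator target = 1). A toy sketch of the two losses (illustrative function names):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Discriminator pushes real scores toward 1 and fake scores toward 0."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Generator pushes the discriminator's fake scores toward 1."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

d_real = np.array([1.0, 1.0])   # perfectly classified real samples
d_fake = np.array([0.0, 0.0])   # perfectly classified fake samples
d_loss = lsgan_d_loss(d_real, d_fake)
g_loss = lsgan_g_loss(d_fake)
```

Unlike the log loss, the squared error still yields a nonzero gradient for confidently rejected fakes, which is why its convergence behaviour is better conditioned.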
In summary, although traditional medical data augmentation can enlarge a limited dataset in quantity, the augmented images lack diverse variation; deep-learning-based augmentation can enlarge limited training samples in both quantity and diversity, but the training process of the synthesis network lacks stability, and the synthesis model needs further optimization.
Disclosure of Invention
Based on this, it is necessary to provide an image synthesis method and system combining an adversarial autoencoder and a generative adversarial network that can augment a limited medical image dataset, enrich medical image databases, stabilize the training of the synthesis network, and improve the precision and generalization of image synthesis, thereby also improving the precision and generalization of medical techniques such as medical image segmentation, registration, and recognition.
In order to achieve the purpose, the invention provides the following scheme:
An image synthesis method combining an adversarial autoencoder and a generative adversarial network, comprising:
constructing an enhanced adversarial autoencoder, which comprises two groups of encoders of different classes, two groups of first discriminators of different classes, and a group of decoders; the two groups of encoders comprise a content encoder and a style encoder, and the two groups of first discriminators comprise a content-code discriminator and a style-code discriminator; the two groups of encoders obtain a content code vector and a style code vector from an input manually segmented vessel-tree image; the two groups of first discriminators distinguish the content code vector from a manually specified prior content code vector and the style code vector from a manually specified prior style code vector, and drive back-propagation adjustment during training; the decoder recombines the content code vector and the style code vector to obtain a reconstructed vessel-tree image;
constructing an improved conditional generative adversarial network, which comprises a generator and a second discriminator; the generator generates a reconstructed fundus retina image from a manually segmented retina outer-contour mask and the reconstructed vessel-tree image output by the enhanced adversarial autoencoder; the second discriminator judges the reconstructed fundus retina image and drives back-propagation adjustment during training;
taking the manually segmented vessel-tree images and the original fundus retina images as training data, and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator;
and synthesizing a fundus retina image from the manually segmented vessel-tree image to be processed based on the optimal vessel-tree image generator and the optimal fundus retina image generator to obtain a synthesized image.
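The two-stage pipeline above (vessel-tree reconstruction by the adversarial autoencoder, then fundus image generation conditioned on the contour mask) can be sketched with stub networks. All function bodies here are illustrative placeholders standing in for trained networks, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 stubs: the enhanced adversarial autoencoder (encoders + decoder).
def content_encoder(v):      # Q_c: vessel tree -> content code (toy)
    return v.mean(axis=1)

def style_encoder(v):        # Q_s: vessel tree -> style code (toy)
    return v.std(axis=1)

def decoder(z_c, z_s):       # R: recombine codes into a reconstructed tree
    return z_c[:, None] + z_s[:, None] * rng.standard_normal((z_c.size, 8))

# Stage 2 stub: conditional generator G(v_hat, m) masked by the retina contour.
def generator(v_hat, mask):
    return np.tanh(v_hat) * mask

v = rng.random((8, 8))       # manually segmented vessel tree (toy)
m = np.ones((8, 8))          # retina outer-contour mask (toy: all inside)

v_hat = decoder(content_encoder(v), style_encoder(v))    # stage 1
synthetic = generator(v_hat, m)                          # stage 2
```

The point of the composition is that varying the style code in stage 1 yields different reconstructed vessel trees, each of which stage 2 turns into a differently styled synthetic fundus image.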
Optionally, the constructing an enhanced adversarial autoencoder specifically comprises:
constructing a content encoder, which comprises a first downsampling layer and a first residual-block network connected in sequence; the first downsampling layer comprises a convolutional layer and first convolution blocks; the first residual-block network comprises a plurality of first standard residual blocks; the convolutional layer comprises a convolution kernel and an activation-function layer; each first convolution block comprises a convolution kernel, an instance normalization layer, and an activation-function layer; each first standard residual block comprises two first convolution blocks joined by a skip connection;
constructing a style encoder, which comprises a second downsampling layer, a global average pooling layer, and a fully connected layer connected in sequence; the second downsampling layer comprises a plurality of the convolutional layers;
constructing a content-code discriminator, which comprises a plurality of first convolution blocks connected in sequence;
constructing a style-code discriminator, which comprises a plurality of second convolution blocks connected in sequence;
constructing a decoder, which comprises a second residual-block network, a multilayer perceptron, and upsampling layers; the second residual-block network comprises a plurality of standard residual blocks, each comprising two second convolution blocks joined by a skip connection; each second convolution block comprises a convolutional layer, an adaptive instance normalization (AdaIN) layer, and an activation-function layer; the multilayer perceptron outputs the AdaIN style parameters corresponding to the style code vector; each upsampling layer comprises a deconvolution layer and an activation-function layer; the number of upsampling layers matches the sum of the numbers of first downsampling layers and second downsampling layers.
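The adaptive instance normalization used in the decoder normalizes each feature channel and then rescales it with style parameters (γ, β); in the architecture above those parameters would come from the multilayer perceptron applied to the style code vector, whereas this sketch supplies them directly:

```python
import numpy as np

def adain(x, gamma, beta, eps=1e-5):
    """Adaptive instance normalization for one feature map of shape (C, H, W).

    Each channel is normalized to zero mean / unit variance over its spatial
    dimensions, then rescaled by per-channel style parameters (gamma, beta).
    """
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    return gamma[:, None, None] * (x - mu) / (sigma + eps) + beta[:, None, None]

x = np.random.default_rng(1).standard_normal((3, 4, 4))
gamma = np.array([2.0, 1.0, 0.5])   # toy style scales
beta = np.array([0.0, 3.0, -1.0])   # toy style shifts
y = adain(x, gamma, beta)
```

Because the spatial statistics are normalized away, the channel means of the output equal β (and the spreads track γ), which is how the style code controls the appearance of the reconstructed vessel tree.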
Optionally, constructing the improved conditional generative adversarial network specifically comprises:
constructing a generator, which comprises convolutional layers, deconvolution layers, and a channel-wise fully connected layer, connected by skip connections; the number of convolutional layers matches the number of deconvolution layers; each convolutional layer comprises a convolution kernel and an activation-function layer; each deconvolution layer comprises a deconvolution kernel and an activation-function layer;
constructing a second discriminator, which comprises convolutional layers, a fully connected layer, and a binary classification layer; each convolutional layer of the discriminator additionally comprises a batch normalization layer and an activation-function layer.
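The skip-connection wiring of the generator above pairs each downsampling stage with the matching upsampling stage, concatenating their feature maps along the channel axis (U-Net style). A toy forward pass with pooling/upsampling standing in for strided (de)convolutions — a sketch, not the patent's network:

```python
import numpy as np

def downsample(x):                      # halve spatial size (2x2 average pool)
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):                        # double spatial size (nearest neighbour)
    return x.repeat(2, axis=1).repeat(2, axis=2)

x0 = np.random.default_rng(2).random((4, 16, 16))   # input feature map
x1 = downsample(x0)                     # encoder feature, kept for the skip
x2 = downsample(x1)                     # bottleneck
u1 = np.concatenate([upsample(x2), x1], axis=0)      # skip connection
u0 = np.concatenate([upsample(u1[:4]), x0], axis=0)  # skip connection
```

The concatenated encoder features let fine vessel detail bypass the bottleneck, which matters when the generator must keep the synthesized fundus image aligned with the input vessel tree.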
Optionally, the taking the manually segmented vessel-tree images and the original fundus retina images as training data and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator specifically comprises:
acquiring manually segmented vessel-tree images and original fundus retina images to obtain training data, each manually segmented vessel-tree image corresponding to an original fundus retina image;
feeding the manually segmented vessel-tree image separately into the content encoder and the style encoder to obtain a content code vector and a style code vector, and feeding both vectors into the decoder to obtain a reconstructed vessel-tree image;
feeding the content code vector together with the prior content code vector into the content-code discriminator for discrimination, and the style code vector together with the prior style code vector into the style-code discriminator for discrimination;
constructing an encoder reconstruction loss function and a first-discriminator adversarial loss function from the manually segmented vessel-tree image and the reconstructed vessel-tree image; the encoder reconstruction loss function is the reconstruction loss corresponding to the content encoder and the style encoder in the enhanced adversarial autoencoder, and the first-discriminator adversarial loss function is the adversarial loss corresponding to the content-code discriminator and the style-code discriminator in the enhanced adversarial autoencoder;
feeding the manually segmented retina outer-contour mask, the reconstructed vessel-tree image, the manually segmented vessel-tree image, and the original fundus retina image into the generator to generate a reconstructed fundus retina image;
feeding the reconstructed vessel-tree image, the reconstructed fundus retina image, the manually segmented vessel-tree image, and the original fundus retina image into the second discriminator for judgment;
constructing a second-discriminator adversarial loss function and a generator global consistency loss function from the original fundus retina image and the reconstructed fundus retina image; the second-discriminator adversarial loss function is the adversarial loss of the second discriminator in the improved conditional generative adversarial network, and the generator global consistency loss function is the global consistency loss of the generator in that network;
obtaining a synthesis-model total loss function from the encoder reconstruction loss function, the first-discriminator adversarial loss function, the second-discriminator adversarial loss function, and the generator global consistency loss function;
jointly training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network by back-propagation according to the synthesis-model total loss function, so that their parameters are continuously updated and optimized to obtain the optimal vessel-tree image generator and the optimal fundus retina image generator; the optimal vessel-tree image generator is the trained enhanced adversarial autoencoder, and the optimal fundus retina image generator is the trained improved conditional generative adversarial network.
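Joint training under a combined total loss can be illustrated on scalars: two parameter blocks (an "autoencoder" scale and a "generator" scale) are updated together by gradient descent on a reconstruction term plus a weighted consistency term. All quantities are toy stand-ins for the networks described above:

```python
import numpy as np

rng = np.random.default_rng(3)
v = rng.random(32)           # toy "vessel tree"
r = 2.0 * v + 0.1            # toy "fundus image" target

a, g = 0.5, 0.5              # parameters: v_hat = a*v, r_hat = g*v_hat
lr, lam = 0.1, 1.0           # learning rate, consistency weight
losses = []
for _ in range(300):
    v_hat = a * v
    r_hat = g * v_hat
    recon = np.mean((v_hat - v) ** 2)      # autoencoder reconstruction loss
    consist = np.mean((r_hat - r) ** 2)    # generator consistency loss
    losses.append(recon + lam * consist)
    # hand-derived gradients of the total loss w.r.t. a and g
    grad_a = np.mean(2 * (v_hat - v) * v) + lam * np.mean(2 * (r_hat - r) * g * v)
    grad_g = lam * np.mean(2 * (r_hat - r) * v_hat)
    a -= lr * grad_a
    g -= lr * grad_g
```

Because both blocks see gradients from the shared total loss, the autoencoder parameter is pulled both toward faithful reconstruction and toward reconstructions the downstream generator can use, which is the motivation for joint rather than stage-wise training.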
Optionally, the obtaining a synthesis-model total loss function from the encoder reconstruction loss function, the first-discriminator adversarial loss function, the second-discriminator adversarial loss function, and the generator global consistency loss function specifically comprises:
obtaining an adversarial autoencoder total loss function from the encoder reconstruction loss function and the first-discriminator adversarial loss function;
obtaining a generative adversarial network total loss function from the second-discriminator adversarial loss function and the generator global consistency loss function;
and linearly adding the adversarial autoencoder total loss function and the generative adversarial network total loss function to obtain the synthesis-model total loss function.
Optionally, the adversarial autoencoder total loss function is
$$L_{AAE}(D_{z_c}, D_{z_s}, Q_c, Q_s, R) = L_{adv}^{z_c} + L_{adv}^{z_s} + \lambda_0 L_{Recon}$$
wherein L_{AAE} is the adversarial autoencoder total loss function; D_{z_c} is the content-code discriminator, D_{z_s} is the style-code discriminator, Q_c is the content encoder, Q_s is the style encoder, and R is the decoder; L_{adv}^{z_c} is the adversarial loss corresponding to the content-code discriminator and L_{adv}^{z_s} is the adversarial loss corresponding to the style-code discriminator; L_{Recon} is the encoder reconstruction loss; the first discriminator comprises D_{z_c} and D_{z_s}; λ_0 is the weight parameter of the adversarial autoencoder total loss function, balancing the adversarial and reconstruction losses;
$$L_{Recon} = \mathbb{E}_{v \sim P_{data}(v)}\big[\, \| v - R(Q_c(v), Q_s(v)) \| \,\big]$$
wherein v denotes a manually segmented vessel-tree image; E_{v∼P_{data}(v)} denotes the expectation over manually segmented vessel-tree images and P_{data}(v) their data distribution; Q_c(v) is the content encoder applied to v, and Q_s(v) is the style encoder applied to the manually segmented vessel-tree image;
$$L_{adv}^{z_c} = \mathbb{E}_{z_c \sim P(z_c)}\big[\log D_{z_c}(z_c)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_c}(Q_c(z_c \mid v))\big)\big]$$
$$L_{adv}^{z_s} = \mathbb{E}_{z_s \sim P(z_s)}\big[\log D_{z_s}(z_s)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_s}(Q_s(z_s \mid v))\big)\big]$$
wherein z_c denotes a content code vector and z_s a style code vector; P(z_c) denotes the manually imposed prior distribution of the content latent variable and P(z_s) the manually imposed prior distribution of the style latent variable; E_{z_c∼P(z_c)} denotes the expectation corresponding to the content code vector and E_{z_s∼P(z_s)} the expectation corresponding to the style code vector; Q_c(z_c | v) denotes the content-code distribution function and Q_s(z_s | v) the style-code distribution function, from which the content code vector and style code vector are obtained; D_{z_c}(z_c) is the content-code discriminator with input z_c, and D_{z_s}(z_s) is the style-code discriminator with input z_s; D_{z_c}(Q_c(z_c | v)) is the content-code discriminator with input Q_c(z_c | v), and D_{z_s}(Q_s(z_s | v)) is the style-code discriminator with input Q_s(z_s | v).
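The adversarial autoencoder total loss combines two adversarial terms with a weighted reconstruction term. A toy numerical sketch, with illustrative discriminator scores and a log-loss form following the standard adversarial autoencoder objective (an assumption, since the patent renders its formulas as images):

```python
import numpy as np

def aae_total_loss(d_prior_c, d_enc_c, d_prior_s, d_enc_s, v, v_recon, lam0=1.0):
    """Adversarial autoencoder total loss:
    L_adv^{z_c} + L_adv^{z_s} + lam0 * L_Recon, with log adversarial terms
    (prior codes scored as real, encoded codes as fake)."""
    adv_c = np.mean(np.log(d_prior_c)) + np.mean(np.log(1.0 - d_enc_c))
    adv_s = np.mean(np.log(d_prior_s)) + np.mean(np.log(1.0 - d_enc_s))
    recon = np.mean(np.abs(v - v_recon))
    return adv_c + adv_s + lam0 * recon

# Toy discriminator scores in (0, 1) and a perfect reconstruction:
half = np.full(4, 0.5)
loss = aae_total_loss(half, half, half, half, np.ones(4), np.ones(4))
```

With maximally uncertain discriminators (all scores 0.5) and zero reconstruction error, the loss reduces to the two adversarial terms alone.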
Optionally, the generative adversarial network total loss function is
$$L_{im2im}(G, D) = L_{adv}(G, D) + \lambda_1 L_{glob}(G)$$
wherein L_{im2im}(G, D) is the generative adversarial network total loss function; G denotes the generator in the improved conditional generative adversarial network and D denotes its second discriminator; L_{adv}(G, D) denotes the second-discriminator adversarial loss function; L_{glob}(G) denotes the generator global consistency loss function; λ_1 is the weight parameter of the generative adversarial network total loss function, balancing the generator global consistency loss and the second-discriminator adversarial loss;
$$L_{glob}(G) = \mathbb{E}_{(v, r)}\big[\, \| r - G(\hat{v}, m) \|_1 \,\big]$$
wherein E_{(v,r)} denotes the expectation over corresponding pairs of manually segmented vessel-tree images and original fundus retina images; v denotes a manually segmented vessel-tree image, r denotes an original fundus retina image, v̂ denotes a reconstructed vessel-tree image, and m is the retina outer-contour mask; G(v̂, m) denotes the generator with inputs v̂ and m;
$$L_{adv}(G, D) = \mathbb{E}_{(v, r)}\big[\log D(m, (v, r))\big] + \mathbb{E}_{\hat{v}}\big[\log\big(1 - D(m, (\hat{v}, G(\hat{v}, m)))\big)\big]$$
wherein E_{v̂} denotes the expectation corresponding to the reconstructed vessel-tree image; D(m, (v, r)) denotes the second discriminator with inputs m, v, and r; D(m, (v̂, G(v̂, m))) denotes the second discriminator with inputs m, v̂, and G(v̂, m), the generator output for inputs v̂ and m.
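The conditional GAN total loss adds the generator's global consistency term, weighted by λ_1, to the adversarial term. A toy numerical sketch, with illustrative scores and an L1 consistency/log adversarial form (an assumption, since the patent renders its formulas as images):

```python
import numpy as np

def gan_total_loss(d_real, d_fake, r, r_fake, lam1=10.0):
    """Conditional GAN total loss: adversarial term plus lam1 times the
    generator's global (L1) consistency loss."""
    adv = np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))
    glob = np.mean(np.abs(r - r_fake))
    return adv + lam1 * glob

d_real = np.full(4, 0.9)        # discriminator scores on real (m, v, r)
d_fake = np.full(4, 0.1)        # scores on generated (m, v_hat, G(v_hat, m))
r = np.ones((2, 2))             # original fundus image (toy)
r_fake = np.full((2, 2), 0.8)   # reconstructed fundus image, off by 0.2
loss = gan_total_loss(d_real, d_fake, r, r_fake, lam1=10.0)
```

The λ_1 weight trades off pixel-level fidelity to the paired real image against realism as judged by the second discriminator.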
Optionally, the synthesis-model total loss function is
$$L_{total} = L_{AAE}(D_{z_c}, D_{z_s}, Q_c, Q_s, R) + L_{im2im}(G, D)$$
wherein L_{total} denotes the synthesis-model total loss function.
The present invention also provides an image synthesis system combining an adversarial autoencoder and a generative adversarial network, comprising:
an enhanced adversarial autoencoder construction module for constructing an enhanced adversarial autoencoder; the enhanced adversarial autoencoder comprises two groups of encoders of different classes, two groups of first discriminators of different classes, and a group of decoders; the two groups of encoders comprise a content encoder and a style encoder, and the two groups of first discriminators comprise a content-code discriminator and a style-code discriminator; the two groups of encoders obtain a content code vector and a style code vector from an input manually segmented vessel-tree image; the two groups of first discriminators distinguish the content code vector from a manually specified prior content code vector and the style code vector from a manually specified prior style code vector, and drive back-propagation adjustment during training; the decoder recombines the content code vector and the style code vector to obtain a reconstructed vessel-tree image;
an improved conditional generative adversarial network construction module for constructing an improved conditional generative adversarial network, which comprises a generator and a second discriminator; the generator generates a reconstructed fundus retina image from the manually segmented retina outer-contour mask and the reconstructed vessel-tree image output by the enhanced adversarial autoencoder; the second discriminator judges the reconstructed fundus retina image and drives back-propagation adjustment during training;
a model training module for taking the manually segmented vessel-tree images and the original fundus retina images as training data and iteratively training the combined enhanced adversarial autoencoder and improved conditional generative adversarial network to obtain an optimal vessel-tree image generator and an optimal fundus retina image generator;
and a synthesized-image determination module for synthesizing a fundus retina image from the manually segmented vessel-tree image to be processed based on the optimal vessel-tree image generator and the optimal fundus retina image generator to obtain a synthesized image.
Optionally, the model training module specifically includes:
the training data acquisition unit is used for acquiring an artificial segmentation blood vessel tree image and an original fundus retina image to obtain training data; the artificial segmented blood vessel tree image corresponds to the original fundus retina image;
the vessel tree image reconstruction unit is used for respectively taking the artificially segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image;
a coding judgment unit configured to perform discrimination by using the content coding vector and the prior content coding vector as inputs of the content coding discriminator, and perform discrimination by using the style coding vector and the prior style coding vector as inputs of the style coding discriminator;
an enhanced countermeasure automatic encoder loss function establishing unit, configured to construct an encoder reconstruction loss function and a first discriminator countermeasure loss function according to the artificially segmented vessel tree image and the reconstructed vessel tree image; the encoder reconstruction loss function is a reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator confrontation loss function is a confrontation loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced confrontation automatic encoder;
a fundus retina image reconstruction unit, configured to generate a reconstructed fundus retina image by using the artificial segmented retina outer contour mask, the reconstructed blood vessel tree image, the artificial segmented blood vessel tree image, and the original fundus retina image as the input of the generator;
a fundus retina image discrimination unit, configured to use the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificially segmented blood vessel tree image, and the original fundus retina image as input of a second discriminator to perform discrimination;
a generation countermeasure network loss function establishing unit, configured to construct a second discriminator countermeasure loss function and a generator global consistency loss function according to the original fundus retina image and the reconstructed fundus retina image; the second discriminator countermeasure loss function is the countermeasure loss function of the second discriminator in the improved conditional generation countermeasure network; the generator global consistency loss function is the global consistency loss function of the generator in the improved conditional generation countermeasure network;
a synthetic model total loss function establishing unit, configured to obtain a synthetic model total loss function according to the encoder reconstruction loss function, the first discriminator countermeasure loss function, the second discriminator countermeasure loss function, and the generator global consistency loss function;
an optimal model determining unit, configured to jointly train the combined enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network in a back propagation manner according to the synthetic model total loss function, so that the parameters of both are continuously updated and optimized to obtain an optimal vessel tree image generator and an optimal fundus retina image generator; the optimal vessel tree image generator is the trained combined enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an image synthesis method and system combining a countermeasure autoencoder and a generation countermeasure network. The method first constructs an enhanced countermeasure automatic encoder comprising two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders, and constructs an improved conditional generation countermeasure network comprising a generator and a second discriminator. Then, taking the artificially segmented blood vessel tree image and the original fundus retina image as training data, the constructed model is iteratively trained to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator. Finally, synthetic sample data with higher precision and more diversified styles is obtained based on the optimal model. By adopting the method or the system, sample data amplification can be performed on a limited medical image data set, enriching the medical image database; the stability of the synthesis network training process can be improved; and the precision and generalization of image synthesis are improved, which in turn improves the precision and generalization of medical technologies such as medical image segmentation, medical image registration and medical image identification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an image synthesis method incorporating a countermeasure autoencoder and generating a countermeasure network in accordance with an embodiment of the present invention;
FIG. 2 is a block diagram of an enhanced countermeasure autoencoder in accordance with an embodiment of the present invention;
FIG. 3 is a block diagram of an encoder in the enhanced countermeasure auto-encoder according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a first discriminator of an enhanced countermeasure auto-encoder according to an embodiment of the present invention;
FIG. 5 is a block diagram of a decoder in an enhanced countermeasure auto-encoder according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a generator in a conditionally generated countermeasure network after modification according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a second discriminator in the conditionally generated countermeasure network according to an embodiment of the present invention;
FIG. 8 is a block diagram of an image composition model incorporating an enhanced countermeasure autoencoder and an improved conditional generation countermeasure network in accordance with an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an image synthesis system combining a countermeasure self-encoder and a generation countermeasure network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
FIG. 1 is a flow chart of an image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to an embodiment of the present invention. Referring to FIG. 1, the image synthesis method combining a countermeasure autoencoder and a generation countermeasure network of the present embodiment includes:
step 101: and acquiring an artificial segmentation blood vessel tree image and an original fundus retina image to obtain training data.
The artificial segmented blood vessel tree image corresponds to the original fundus retina image.
Step 102: constructing an enhanced countermeasure autoencoder; the enhanced countermeasure autoencoder includes two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders.
The two groups of encoders of different categories comprise a content encoder and a style encoder, and the two groups of first discriminators of different categories comprise a content coding discriminator and a style coding discriminator. The two groups of encoders of different categories are used for obtaining, from the input artificially segmented blood vessel tree image, a content coding vector and a style coding vector of the artificially segmented blood vessel tree image. The two groups of first discriminators of different categories are used for distinguishing and discriminating the content coding vectors from the manually acquired prior content coding vectors, distinguishing and discriminating the style coding vectors from the manually acquired prior style coding vectors, and performing reverse adjustment training. The decoder is used for recombining the content coding vector and the style coding vector to obtain a reconstructed vessel tree image. The specific structure of the enhanced countermeasure automatic encoder is shown in fig. 2.
Step 103: constructing an improved conditional generation countermeasure network; the improved conditional generation countermeasure network includes a generator and a second discriminator.
Wherein the generator is used for generating a reconstructed fundus retina image according to the artificially segmented retina outer contour mask and the reconstructed blood vessel tree image output by the enhanced countermeasure automatic encoder; and the second discriminator is used for judging the reconstructed fundus retina image and performing reverse adjustment training.
Step 104: and taking the artificially segmented blood vessel tree image and the original fundus retina image as training data, and performing iterative training on the combined enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator.
Step 105: and performing fundus retina image synthesis on the to-be-processed artificial segmentation blood vessel tree image based on the optimal blood vessel tree image generator and the optimal fundus retina image generator to obtain a synthesized image.
Step 102 is the process of reconstructing the blood vessel tree image in this embodiment, and is performed under the deep learning TensorFlow framework. The main purpose of vessel tree image reconstruction is to obtain reconstructed vessel tree images with more diversified characteristics: content coding vectors and style coding vectors are extracted from the artificially segmented vessel tree images by encoders of different types, and the two types of coding vectors are recombined. The goal of vessel tree image reconstruction is to amplify the blood vessel tree image data set in preparation for synthesizing the fundus retina image at a later stage. The reconstruction in step 102 proceeds as follows:
11) A content encoder and a style encoder are constructed.
The content encoder comprises a first down-sampling layer and a first residual block network which are connected in sequence; the first down-sampling layer comprises a convolution layer and a first convolution block; the first network of residual blocks comprises a plurality of first standard residual blocks; the convolution layer comprises a convolution kernel and an activation function layer; the first convolution block includes a convolution kernel, an instance normalization layer, and an activation function layer; the first standard residual block includes two of the first convolution blocks connected in a jump. The style encoder comprises a second down-sampling layer, a global average pooling layer and a full-connection layer which are connected in sequence; the second downsampling layer includes a plurality of the convolutional layers.
In order to fully extract the characteristic information in the artificially segmented blood vessel tree image, the content encoder and the style encoder in this embodiment adopt convolutional neural networks of different structures. The content encoder adopts a standard residual block structure and is used for acquiring the content coding vector of the artificially segmented blood vessel tree image; the content coding vector has domain invariance and is the carrier connecting the artificially segmented blood vessel tree image and the reconstructed blood vessel tree image, determining the structural similarity between them. The style encoder adopts a combined structure of convolution blocks, global average pooling and full connection, and is used for obtaining the style coding vector of the artificially segmented blood vessel tree image; the style coding vector is domain-specific and is the guide factor that enriches the diversity and characteristics of the reconstructed blood vessel tree image.
The structure of the content encoder and the style encoder is shown in fig. 3:
specifically, the content encoder employed includes: the artificially segmented vessel tree image, a convolution layer, two convolution blocks and four standard residual blocks (each residual block comprising two convolution blocks), connected in sequence. The first convolution layer is connected to the input image, using 64 convolution kernels of size 7 × 7 with a stride of 1; the convolution layer is composed of convolution kernels and an activation function. The first and second convolution blocks both have a stride of 2, adopt convolution kernels of size 4 × 4, and have 128 and 256 channels respectively. The four standard residual blocks are identical in structure; only the first is shown in fig. 3. It contains two convolution blocks, each consisting of a convolution kernel, an instance normalization and an activation function, using 256 convolution kernels of size 3 × 3 with a stride of 2.
Specifically, the style encoder adopted includes: the artificially segmented blood vessel tree image, five convolution layers, a global average pooling layer and a fully connected layer, connected in sequence. The first convolution layer is the same as the first convolution layer in the content encoder; the second to fourth convolution layers adopt convolution kernels of size 4 × 4 with a stride of 2, and have 128, 256 and 256 channels in sequence. The kernel size of the global average pooling layer is 2 × 2 with a stride of 2, and the number of channels of the fully connected layer is 8.
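The downsampling arithmetic in the two encoders can be checked with the standard convolution output-size formula. The sketch below assumes a 256 × 256 input and "size-preserving"/"halving" paddings (padding 3 for the 7 × 7 stride-1 layer, padding 1 for the 4 × 4 stride-2 blocks); the patent states kernel sizes, strides and channel counts but not paddings or input resolution.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

# Content encoder downsampling path from the description above
# (paddings and the 256x256 input resolution are illustrative assumptions):
h = 256                       # hypothetical input resolution
h = conv_out(h, 7, 1, 3)      # first conv layer: 64 channels, size preserved
h = conv_out(h, 4, 2, 1)      # first conv block: 128 channels, size halved
h = conv_out(h, 4, 2, 1)      # second conv block: 256 channels, size halved
print(h)  # 64: the residual blocks then operate at 1/4 resolution
```

The same formula applies to the style encoder's stride-2 layers, each of which halves the spatial size before global average pooling collapses it entirely.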
12) A content coding discriminator and a style coding discriminator are constructed. The content coding discriminator includes a plurality of the first convolution blocks connected in sequence. The style coding discriminator includes a plurality of second convolution blocks connected in sequence.
In order to impose a discrimination constraint on the precision of the content coding vectors and style coding vectors acquired by the encoders of different types, so that the precision of the extracted coding vectors of each type is optimal, this embodiment constructs a content coding discriminator and a style coding discriminator. The two discriminators of different types have the same structure, built from convolution blocks. The extracted content coding vector and style coding vector, together with the corresponding manually acquired prior content coding vector and prior style coding vector, are taken as the inputs of the content coding discriminator and the style coding discriminator respectively; the discriminators judge the degree of difference between them and output an evaluation feedback signal used to optimize the parameters of the encoders of different types.
The structure of the content coding discriminator and the style coding discriminator is shown in fig. 4:

specifically, the content coding discriminator and the style coding discriminator adopted have the same structure, including: the content coding vector (or style coding vector), the manually acquired prior content coding vector (or manually acquired prior style coding vector), and four convolution blocks, connected in sequence. The four convolution blocks are similar in structure, adopting convolution kernels of size 4 × 4 with a stride of 2, and have 64, 128, 256 and 512 channels respectively; each convolution block contains a convolution kernel, an instance normalization and an activation function.
13) A decoder is constructed. The decoder comprises a second residual block network, a multilayer perceptron and up-sampling layers; the second residual block network comprises a plurality of first standard residual blocks; each first standard residual block comprises two second convolution blocks joined by a jump connection; each second convolution block includes a convolution layer, an adaptive instance normalization layer and an activation function layer. The multilayer perceptron is used for outputting the adaptive instance normalization style parameters corresponding to the style coding vector. Each up-sampling layer comprises a deconvolution layer and an activation function layer; the number of up-sampling layers matches the sum of the number of first down-sampling layers and the number of second down-sampling layers.
In order to recombine the different types of coding vectors and obtain reconstructed vessel tree images with diverse characteristics, this embodiment combines the same standard residual block structure as in the content encoder with a multilayer perceptron to construct the decoder. The multilayer perceptron extracts the adaptive instance normalization style parameters from the style coding vector, which serve as the guide factor enriching the diversity of the reconstructed blood vessel tree image. The combination of four standard residual blocks and three convolution layers generates the reconstructed vessel tree image.
The structure of the decoder is shown in fig. 5:
specifically, the decoder structure adopted includes: the content coding vector connected to four standard residual blocks, two up-sampling layers and a convolution layer; the style coding vector connected to the multilayer perceptron, whose output is combined with the output of the four residual blocks. The four standard residual blocks have the same structure as those in the content encoder; each standard residual block comprises two convolution blocks, and each convolution block consists of a convolution kernel, an adaptive instance normalization and an activation function. The two up-sampling layers have the same structure, adopting convolution kernels of size 5 × 5 with a stride of 2, with 128 and 64 channels respectively; each up-sampling layer consists of deconvolution kernels and an activation function. The last convolution layer adopts convolution kernels of size 7 × 7 with a stride of 2, and the number of output channels is 3. The style coding vector is taken as the input of the multilayer perceptron to obtain the adaptive instance normalization parameters used to enrich the reconstructed blood vessel tree image.
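The adaptive instance normalization step, in which the MLP-predicted style parameters restyle the content features, can be sketched in NumPy as follows. Shapes and style-parameter values here are illustrative, not taken from the patent.

```python
import numpy as np

def adaptive_instance_norm(content, gamma, beta, eps=1e-5):
    """AdaIN: normalize each channel of the content feature map to zero
    mean and unit variance, then rescale with the style parameters
    (gamma, beta) predicted by the multilayer perceptron."""
    # content: (C, H, W); gamma, beta: (C,)
    mu = content.mean(axis=(1, 2), keepdims=True)
    sigma = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mu) / (sigma + eps)
    return gamma[:, None, None] * normalized + beta[:, None, None]

rng = np.random.default_rng(0)
feat = rng.normal(size=(256, 64, 64))   # content features from the residual path
gamma = rng.normal(size=256)            # style scale from the MLP (hypothetical)
beta = rng.normal(size=256)             # style shift from the MLP (hypothetical)
out = adaptive_instance_norm(feat, gamma, beta)
# Each output channel now carries the style statistics (mean == beta):
print(np.allclose(out.mean(axis=(1, 2)), beta, atol=1e-6))  # True
```

This is what lets one content coding vector be decoded into differently styled reconstructions: only `gamma` and `beta` change between styles.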
The main reasons why the standard residual block is used by both the content encoder and the decoder in the enhanced countermeasure autoencoder of the present embodiment are as follows:
(1) It alleviates the gradient vanishing problem that arises as the number of layers of the training model grows, so that the network weight parameters of the shallow layers can be trained better.
(2) The jump connection in the standard residual block can effectively fuse the characteristic information between the low layer and the high layer, so that the gradient information can smoothly pass through each residual block.
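The jump connection described in the two points above can be sketched as follows, with a 1 × 1 channel mix standing in for the real convolution blocks (a toy NumPy illustration; the weight shapes are illustrative):

```python
import numpy as np

def conv_block(x, weight):
    """Stand-in for one convolution block: a 1x1 channel mix followed by
    ReLU, purely to illustrate the residual wiring, not the real layers."""
    return np.maximum(0.0, np.einsum('oc,chw->ohw', weight, x))

def residual_block(x, w1, w2):
    """Standard residual block: two convolution blocks plus a jump
    connection that adds the input back onto the output."""
    return x + conv_block(conv_block(x, w1), w2)

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 4, 4))
w1, w2 = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
y = residual_block(x, w1, w2)
# With zero weights the block degenerates to an identity map; this is why
# gradient information passes smoothly through each residual block:
assert np.allclose(residual_block(x, np.zeros((8, 8)), np.zeros((8, 8))), x)
```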
Step 103 here is the fundus retina image synthesis process of this embodiment, also performed under the deep learning TensorFlow framework. The main purpose of fundus retina image synthesis is to acquire more diversified reconstructed fundus retina images that are structurally similar to the original fundus retina images; the goal is to expand a limited fundus retina training data set. The synthesis in step 103 proceeds as follows:
21) constructing a generator; the generator comprises a convolution layer, a deconvolution layer and a channel-by-channel full connection layer, and the connection mode of the generator is jump connection; the number of the convolution layers is matched with that of the deconvolution layers; the convolution layer comprises a convolution kernel and an activation function layer; the deconvolution layer includes a deconvolution kernel and an activation function layer.
In order to generate reconstructed fundus retina images with diverse characteristics and high fidelity that are structurally similar to the original fundus retina images, the generator network of this embodiment adopts a convolutional neural network structure similar to U-Net. As the depth of the model increases, part of the effective feature information in the reconstructed image is lost; jump connections are added to fuse the feature information of low and high layers and compensate for this loss. The fundus retina outer contour mask is added to the improved conditional generation countermeasure network so that fundus retina images can be acquired more efficiently.
The structure of the generator is shown in fig. 6:
specifically, the generator network structure adopted includes: sequentially connecting the fundus retina outline mask with the reconstructed blood vessel tree image, four convolutional layers, a channel-by-channel full-connection layer and four deconvolution layers. The first convolution layer adopts convolution kernels with the size of 7 × 7 and the step pitch of 2, the second convolution layer and the third convolution layer both adopt convolution kernels with the size of 5 × 5 and the step pitch of 2, the fourth convolution layer adopts convolution kernels with the size of 3 × 3 and the step pitch of 2, and the number of channels corresponding to the four convolution layers is 64, 128, 256 and 512 respectively; the number of channels of the channel-by-channel full connection layer is 512; the four deconvolution layers and the four convolution layers are symmetrical in structure, parameters of the first deconvolution layer are the same as those of the fourth convolution layer, parameters of the second deconvolution layer and the third deconvolution layer are the same as those of the third convolution layer and the second convolution layer, and parameter settings of the fourth deconvolution layer and the first convolution layer are the same.
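The symmetry between the convolution and deconvolution layers described above can be checked with the transposed-convolution output-size formula. The paddings here are assumptions (the patent states only kernel sizes, strides and channel counts); they are chosen so that each deconvolution exactly doubles the spatial size its mirror-image convolution halved.

```python
def deconv_out(size, kernel, stride, padding, output_padding=0):
    """Spatial output size of a transposed convolution (deconvolution)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

# A 4x4 stride-2 deconvolution with padding 1 doubles the spatial size:
print(deconv_out(64, 4, 2, 1))      # 128
# For the generator's odd 5x5 kernels, padding 2 plus output_padding 1
# restores exact doubling (these paddings are illustrative assumptions):
print(deconv_out(64, 5, 2, 2, 1))   # 128
```

This is the arithmetic behind the statement that "the four deconvolution layers and the four convolution layers are symmetrical in structure": each upsampling stage undoes the spatial reduction of the corresponding downsampling stage.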
22) A second discriminator is constructed. The second discriminator comprises convolution layers, a fully connected layer and a binary classification layer; each convolution layer of the discriminator includes a batch normalization layer and an activation function layer. The binary classification layer is used for distinguishing real images from generated images according to the features output by the fully connected layer.

In order to ensure that the reconstructed fundus retina image is visually more realistic and structurally similar to the original fundus retina image, a binary classification discriminator network is constructed to judge whether an input image is an original fundus retina image or a reconstructed fundus retina image.
The structure of the discriminator is shown in FIG. 7:
specifically, the second discriminator includes: the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificially segmented blood vessel tree image and the original fundus retina image, followed by four convolution layers, a fully connected layer and a binary classification layer, connected in sequence. The four convolution layers have the same structure, adopting convolution kernels of size 5 × 5 with a stride of 2, with 64, 128, 256 and 512 channels respectively; the number of channels of the fully connected layer is 512, and the number of channels of the binary classification layer is 1024.
The reasons why the generator network in the improved conditional generation countermeasure network of this embodiment adopts a symmetrical structure similar to U-Net are as follows:

(1) With the added jump connections, the symmetrical structure effectively fuses the feature information between low and high layers, compensating for the loss of effective information caused by deeper network layers.

(2) The U-Net network has a unique structure and strong adaptability, and a model with better performance can be acquired by training on a limited data set.
Step 104 is the iterative training process performed on the synthesis model combining the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network, based on the training data set of artificially segmented blood vessel tree images and original fundus retina images. The iterative training in step 104 proceeds as follows:
31) acquiring an artificially segmented blood vessel tree image and an original fundus retina image to obtain training data; the artificial segmented blood vessel tree image corresponds to the original fundus retina image.
32) And respectively taking the manually segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image.
33) And taking the content coding vector and the prior content coding vector as the input of the content coding discriminator to distinguish, and taking the style coding vector and the prior style coding vector as the input of the style coding discriminator to distinguish.
34) An encoder reconstruction loss function and a first discriminator countermeasure loss function are constructed according to the artificially segmented blood vessel tree image and the reconstructed blood vessel tree image; the encoder reconstruction loss function is the reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator countermeasure loss function is the countermeasure loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced countermeasure automatic encoder.
35) And generating a reconstructed fundus retina image by taking the artificial segmented retina outer contour mask, the reconstructed blood vessel tree image, the artificial segmented blood vessel tree image and the original fundus retina image as the input of the generator.
36) And taking the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificial segmentation blood vessel tree image and the original fundus retina image as the input of a second discriminator for judgment.
37) A second discriminator countermeasure loss function and a generator global consistency loss function are constructed according to the original fundus retina image and the reconstructed fundus retina image; the second discriminator countermeasure loss function is the countermeasure loss function of the second discriminator in the improved conditional generation countermeasure network; the generator global consistency loss function is the global consistency loss function of the generator in the improved conditional generation countermeasure network.
38) And obtaining a total loss function of a synthetic model according to the reconstruction loss function of the encoder, the countermeasure loss function of the first discriminator, the countermeasure loss function of the second discriminator and the global consistency loss function of the generator. The method specifically comprises the following steps:
a countermeasure automatic encoder total loss function is obtained from the encoder reconstruction loss function and the first discriminator countermeasure loss function; a generation countermeasure network total loss function is obtained from the second discriminator countermeasure loss function and the generator global consistency loss function; and the countermeasure automatic encoder total loss function and the generation countermeasure network total loss function are linearly added to obtain the synthetic model total loss function.
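The linear addition described above can be sketched numerically as follows. The function name and the weight values are illustrative; the patent only names the weight parameters λ0 and λ1.

```python
def synthesis_total_loss(l_adv_c, l_adv_s, l_recon, l_adv_gan, l_glob,
                         lambda0=1.0, lambda1=10.0):
    """Synthetic model total loss: countermeasure autoencoder total loss
    plus generation countermeasure network total loss. lambda0 and lambda1
    are the balancing weights; their values here are hypothetical."""
    l_aae = l_adv_c + l_adv_s + lambda0 * l_recon   # countermeasure autoencoder total
    l_gan = l_adv_gan + lambda1 * l_glob            # generation countermeasure net total
    return l_aae + l_gan

print(synthesis_total_loss(1.0, 1.0, 2.0, 1.0, 0.5))  # 10.0
```

During joint training, a single backward pass through this combined scalar updates the encoders, decoder, generator and discriminators together.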
39) The combined enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network are jointly trained in a back propagation manner according to the synthetic model total loss function, so that their parameters are continuously updated and optimized to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator; the optimal blood vessel tree image generator is the trained combined enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network. The optimal blood vessel tree image generator and the optimal fundus retina image generator constitute the final overall synthesis model, shown in fig. 8.
In this step, during the process of continuous updating and optimization, the whole synthesis model generates a series of reconstructed blood vessel tree images and reconstructed fundus retina images. The discriminator network learns from the reconstructed and original fundus retina images to determine how structurally close the reconstructed fundus retina image is to the original, and outputs a feedback signal evaluating the synthesis quality. The blood vessel tree image generator network and the fundus retina image generator network optimize their network parameters according to this feedback signal, so that reconstructed fundus retina images with more diversified characteristics can be obtained, finally yielding the optimal blood vessel tree image generator network and the optimal fundus retina image generator network.
Wherein the countermeasure autoencoder total loss function in step 38) is

$$\min_{Q_c,\,Q_s,\,R}\;\max_{D_{z_c},\,D_{z_s}}\; L_{AAE} = L_{adv}^{z_c}\big(D_{z_c}, Q_c\big) + L_{adv}^{z_s}\big(D_{z_s}, Q_s\big) + \lambda_0\, L_{Recon}\big(Q_c, Q_s, R\big)$$

wherein $L_{AAE}$ is the countermeasure autoencoder total loss function; $D_{z_c}$ is the content coding discriminator, $D_{z_s}$ is the style coding discriminator, $Q_c$ is the content encoder, $Q_s$ is the style encoder, and $R$ is the decoder; $L_{adv}^{z_c}$ is the countermeasure loss corresponding to the content coding discriminator, and $L_{adv}^{z_s}$ is the countermeasure loss corresponding to the style coding discriminator; $L_{Recon}$ is the encoder reconstruction loss; the first discriminator comprises $D_{z_c}$ and $D_{z_s}$; $\lambda_0$ is the weight parameter of the countermeasure autoencoder total loss function, used to balance the countermeasure losses and the reconstruction loss.
$$L_{Recon}\big(Q_c, Q_s, R\big) = \mathbb{E}_{v \sim P_{data}(v)}\Big[\big\lVert R\big(Q_c(v), Q_s(v)\big) - v \big\rVert_1\Big]$$

wherein $v$ denotes the artificially segmented vessel tree image; $\mathbb{E}_{v \sim P_{data}(v)}$ denotes the expected value corresponding to the artificially segmented vessel tree image, and $P_{data}(v)$ denotes the data distribution of the artificially segmented vessel tree images; $Q_c(v)$ is the content encoder with input $v$, and $Q_s(v)$ is the style encoder with input the artificially segmented blood vessel tree image.
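A minimal numeric sketch of the encoder reconstruction loss, assuming an L1 pixel distance (the patent names the loss but the image placeholder omits the norm; the function name and the toy images are illustrative):

```python
import numpy as np

def reconstruction_loss(v, v_recon):
    """Encoder reconstruction loss: mean pixel-wise L1 distance between
    the artificially segmented vessel tree image v and its reconstruction
    R(Qc(v), Qs(v)). The L1 choice is an assumption."""
    return np.abs(v - v_recon).mean()

v = np.array([[0.0, 1.0], [1.0, 0.0]])         # toy binary vessel mask
v_recon = np.array([[0.25, 0.75], [0.75, 0.25]])  # hypothetical decoder output
print(reconstruction_loss(v, v_recon))  # 0.25
```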
$$\mathcal{L}_{adv}^{D_{z_c}}=\mathbb{E}_{z_c\sim P(z_c)}\left[\log D_{z_c}(z_c)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_c}\left(Q_c(z_c\mid v)\right)\right)\right]$$

$$\mathcal{L}_{adv}^{D_{z_s}}=\mathbb{E}_{z_s\sim P(z_s)}\left[\log D_{z_s}(z_s)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_s}\left(Q_s(z_s\mid v)\right)\right)\right]$$

Wherein $z_c$ represents the content encoding vector, and $z_s$ represents the style encoding vector; $P(z_c)$ represents the prior distribution of the artificially added content latent variable, and $P(z_s)$ represents the prior distribution of the artificially added style latent variable; $\mathbb{E}_{z_c\sim P(z_c)}$ represents the expected value corresponding to the content encoding vector, and $\mathbb{E}_{z_s\sim P(z_s)}$ represents the expected value corresponding to the style encoding vector; $Q_c(z_c\mid v)$ represents the content encoding distribution function, $Q_s(z_s\mid v)$ represents the style encoding distribution function, $Q_c(z_c\mid v)$ is used for obtaining the content encoding vector, and $Q_s(z_s\mid v)$ is used for obtaining the style encoding vector; $D_{z_c}(z_c)$ is the content encoding discriminator with input $z_c$, and $D_{z_s}(z_s)$ is the style encoding discriminator with input $z_s$; $D_{z_c}(Q_c(z_c\mid v))$ is the content encoding discriminator with input $Q_c(z_c\mid v)$, and $D_{z_s}(Q_s(z_s\mid v))$ is the style encoding discriminator with input $Q_s(z_s\mid v)$.
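The countermeasure autoencoder total loss above combines the two code-discriminator countermeasure losses with the weighted reconstruction loss. The following is an illustrative numerical sketch only, not the patent's implementation: toy arrays stand in for the vessel tree image, its reconstruction $R(Q_c(v),Q_s(v))$, and the discriminator outputs, and the shapes and the value of the weight parameter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def recon_loss(v, v_hat):
    # Encoder reconstruction loss: mean L1 distance between the artificially
    # segmented vessel tree image v and its reconstruction R(Q_c(v), Q_s(v)).
    return np.abs(v - v_hat).mean()

def code_adv_loss(d_prior, d_encoded):
    # Countermeasure loss of a code discriminator: prior samples z ~ P(z)
    # should be scored as real, encoder outputs Q(z|v) as fake.
    eps = 1e-8
    return np.log(d_prior + eps).mean() + np.log(1.0 - d_encoded + eps).mean()

# Toy tensors standing in for images and discriminator outputs (assumed shapes).
v     = rng.random((4, 64, 64))                       # vessel tree "images"
v_hat = v + 0.05 * rng.standard_normal(v.shape)       # imperfect reconstruction
d_c_prior, d_c_enc = rng.uniform(0.6, 0.9, 8), rng.uniform(0.1, 0.4, 8)
d_s_prior, d_s_enc = rng.uniform(0.6, 0.9, 8), rng.uniform(0.1, 0.4, 8)

lambda_0 = 10.0  # assumed weight balancing countermeasure and reconstruction losses
loss_aae = (code_adv_loss(d_c_prior, d_c_enc)
            + code_adv_loss(d_s_prior, d_s_enc)
            + lambda_0 * recon_loss(v, v_hat))
```

A perfect reconstruction drives the reconstruction term to zero, while a confident discriminator (prior scored near 1, encoded near 0) drives its countermeasure term toward zero.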
The generated countermeasure network total loss function in step 38) is

$$L_{im2im}(G,D)=L_{adv}(G,D)+\lambda_1\mathcal{L}_{global}(G)$$

$$\mathcal{L}_{global}(G)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\left\|r-G(v,m)\right\|_1\right]$$

Wherein $L_{im2im}(G,D)$ is the generated countermeasure network total loss function; $G$ represents the generator in the improved conditional generation countermeasure network, and $D$ represents the second discriminator in the improved conditional generation countermeasure network; $L_{adv}(G,D)$ represents the second discriminator countermeasure loss function; $\mathcal{L}_{global}(G)$ represents the generator global consistency loss function; $\mathbb{E}_{(v,r)\sim P_{data}(v,r)}$ represents the expected value corresponding to the artificially segmented blood vessel tree image and the original fundus retina image; $v$ represents the artificially segmented blood vessel tree image, $r$ represents the original fundus retina image, $\tilde{v}$ represents the reconstructed blood vessel tree image, and $m$ is the retina outer contour mask; $\lambda_1$ is the weight parameter corresponding to the generated countermeasure network total loss function, and is used for balancing the generator global consistency loss and the second discriminator countermeasure loss; $G(v,m)$ represents the generator with inputs $v$ and $m$.
$$L_{adv}(G,D)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\log D(m,(v,r))\right]+\mathbb{E}_{\tilde{v}}\left[\log\left(1-D\left(m,\left(\tilde{v},G(m,\tilde{v})\right)\right)\right)\right]$$

Wherein $\mathbb{E}_{\tilde{v}}$ represents the expected value corresponding to the reconstructed blood vessel tree image; $D(m,(v,r))$ represents the second discriminator with inputs $m$, $v$, and $r$; $D(m,(\tilde{v},G(m,\tilde{v})))$ is the second discriminator with inputs $m$, $\tilde{v}$, and $G(m,\tilde{v})$; $G(m,\tilde{v})$ represents the generator with inputs $m$ and $\tilde{v}$.
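The two terms of the improved conditional generation countermeasure network total loss can likewise be sketched numerically. This is an illustrative sketch only, not the patent's implementation: the discriminator outputs and images are toy arrays, and the shapes and the value of the weight parameter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def adv_loss(d_real, d_fake):
    # Second-discriminator countermeasure loss: D(m,(v,r)) should approach 1
    # on real pairs, D(m,(v~,G(m,v~))) should approach 0 on synthesized pairs.
    eps = 1e-8
    return np.log(d_real + eps).mean() + np.log(1.0 - d_fake + eps).mean()

def global_consistency_loss(r, g_out):
    # Mean L1 distance between the original fundus retina image r and the
    # generator output, encouraging globally consistent structure.
    return np.abs(r - g_out).mean()

r     = rng.random((4, 64, 64, 3))                 # original fundus images
g_out = r + 0.1 * rng.standard_normal(r.shape)     # generator output
d_real = rng.uniform(0.7, 0.95, 8)
d_fake = rng.uniform(0.05, 0.3, 8)

lambda_1 = 100.0  # assumed weight balancing consistency against countermeasure loss
loss_im2im = adv_loss(d_real, d_fake) + lambda_1 * global_consistency_loss(r, g_out)
```

The L1 consistency term vanishes only when the generator output exactly matches the original fundus retina image.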
The total loss function of the synthetic model in step 38) is

$$\mathcal{L}_{total}=\mathcal{L}_{AAE}+L_{im2im}(G,D)$$

Wherein $\mathcal{L}_{total}$ represents the synthetic model total loss function.
The image synthesis method combining the countermeasure autoencoder and the generation countermeasure network in this embodiment can augment a limited medical image data set with additional samples, enrich a medical image database, improve the stability of the synthesis network training process, and improve the precision and generalization performance of image synthesis, thereby promoting improvements in the precision and generalization performance of medical technologies such as medical image segmentation, medical image registration, and medical image recognition.
In addition, by constructing the countermeasure loss function and the global consistency loss function in the improved conditional generation countermeasure network, this embodiment can directly give quantitative evaluation parameters and determine the degree of structural closeness between the reconstructed fundus retina image and the original fundus retina image.
The invention also provides an image synthesis system combining a countermeasure autoencoder and a generation countermeasure network, and fig. 9 is a schematic structural diagram of the image synthesis system combining a countermeasure autoencoder and a generation countermeasure network according to an embodiment of the invention.
Referring to fig. 9, the image synthesizing system combining the countermeasure autoencoder and the generation countermeasure network of the present embodiment includes:
a training data obtaining module 201, configured to obtain an artificial segmented blood vessel tree image and an original fundus retina image to obtain training data.
An enhanced countermeasure autoencoder construction module 202 for constructing an enhanced countermeasure autoencoder; the enhanced countermeasure automatic encoder comprises two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders; the two groups of encoders of different categories comprise a content encoder and a style encoder, and the two groups of first discriminators of different categories comprise a content encoding discriminator and a style encoding discriminator; the two groups of encoders of different categories are used for obtaining, from the input artificially segmented blood vessel tree image, a content encoding vector and a style encoding vector of the artificially segmented blood vessel tree image; the two groups of first discriminators of different categories are used for discriminating between the content encoding vector and the artificially acquired prior content encoding vector, and between the style encoding vector and the artificially acquired prior style encoding vector, and for performing reverse adjustment training; and the decoder is used for recombining the content encoding vector and the style encoding vector to obtain a reconstructed blood vessel tree image.
A modified conditional generation countermeasure network construction module 203 for constructing a modified conditional generation countermeasure network; the improved conditional generation countermeasure network comprises a generator and a second discriminator; the generator is used for generating a reconstructed fundus retina image according to the artificial segmented retina outline mask and the reconstructed blood vessel tree image output by the construction enhanced countermeasure automatic encoder; and the second discriminator is used for judging the reconstructed fundus retina image and carrying out reverse adjustment training.
And the model training module 204 is configured to perform iterative training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network by using the artificially segmented blood vessel tree image and the original fundus retina image as training data, to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator.
And a composite image determining module 205, configured to perform fundus retina image synthesis on the to-be-processed artificially segmented blood vessel tree image based on the optimal blood vessel tree image generator and the optimal fundus retina image generator, so as to obtain a composite image.
As an optional implementation, the enhanced countermeasure autoencoder constructing module 202 specifically includes:
a content encoder constructing unit for constructing a content encoder; the content encoder comprises a first down-sampling layer and a first residual block network which are connected in sequence; the first down-sampling layer comprises a convolution layer and a first convolution block; the first network of residual blocks comprises a plurality of first standard residual blocks; the convolution layer comprises a convolution kernel and an activation function layer; the first convolution block comprises a convolution kernel, an instance normalization layer and an activation function layer; the first standard residual block includes two of the first convolution blocks connected in a jump.
A style encoder constructing unit for constructing a style encoder; the style encoder comprises a second down-sampling layer, a global average pooling layer and a full-connection layer which are connected in sequence; the second downsampling layer includes a plurality of the convolutional layers.
A content encoding discriminator constructing unit for constructing a content encoding discriminator; the content encoding discriminator includes a plurality of the first convolution blocks connected in sequence.
A style code discriminator construction unit for constructing a style code discriminator; the style coding discriminator comprises a plurality of second convolution blocks which are connected in sequence.
A decoder construction unit for constructing a decoder; the decoder comprises a second residual block network, a multilayer perceptron and an upsampling layer; the second network of residual blocks comprises a plurality of first standard residual blocks; the first standard residual block comprises two second convolution blocks connected in a jumping-over manner; said second convolution block includes a convolution layer, an adaptive instance normalization layer, and an activation function layer; the multilayer perceptron is used for outputting self-adaptive example normalized style parameters corresponding to the style coding vectors; the up-sampling layer comprises an anti-convolution layer and an activation function layer; the number of upsampling layers matches the sum of the number of first downsampling layers and the number of second downsampling layers.
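The adaptive instance normalization used in the decoder's second convolution block can be sketched as follows. In this illustrative sketch (assumed toy shapes, not the patent's code), the style parameters gamma and beta are supplied directly, standing in for the output of the multilayer perceptron that would predict them from the style encoding vector.

```python
import numpy as np

def adaptive_instance_norm(content, gamma, beta, eps=1e-5):
    # Normalize each (sample, channel) plane of the content features, then
    # apply the per-channel style scale (gamma) and shift (beta) parameters.
    mean = content.mean(axis=(2, 3), keepdims=True)
    std = content.std(axis=(2, 3), keepdims=True)
    normed = (content - mean) / (std + eps)
    return gamma[:, :, None, None] * normed + beta[:, :, None, None]

rng = np.random.default_rng(3)
content = rng.standard_normal((2, 4, 8, 8))  # content features (N, C, H, W)
gamma = rng.standard_normal((2, 4))          # per-channel style scales
beta = rng.standard_normal((2, 4))           # per-channel style shifts
styled = adaptive_instance_norm(content, gamma, beta)
```

With unit scale and zero shift, each normalized channel has zero mean, so the layer's statistics are fully determined by the style parameters.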
A generator constructing unit for constructing a generator; the generator comprises a convolution layer, a deconvolution layer and a channel-by-channel full connection layer, and the connection mode of the generator is jump connection; the number of the convolution layers is matched with that of the deconvolution layers; the convolution layer comprises a convolution kernel and an activation function layer; the deconvolution layer includes a deconvolution kernel and an activation function layer.
A second discriminator construction unit for constructing a second discriminator; the second discriminator comprises a convolution layer, a full connection layer and a binary layer corresponding to the discriminator; and the convolution layer corresponding to the discriminator comprises a combined batch normalization layer and an activation function layer.
As an optional implementation manner, the model training module 204 specifically includes:
the training data acquisition unit is used for acquiring an artificial segmented blood vessel tree image and an original fundus retina image to obtain training data; the artificially segmented blood vessel tree image corresponds to the original fundus retina image.
And the vessel tree image reconstruction unit is used for respectively taking the artificially segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image.
And an encoding judgment unit configured to perform discrimination by using the content encoding vector and the prior content encoding vector as inputs to the content encoding discriminator, and perform discrimination by using the genre encoding vector and the prior genre encoding vector as inputs to the genre encoding discriminator.
An enhanced countermeasure automatic encoder loss function establishing unit, configured to construct an encoder reconstruction loss function and a first discriminator countermeasure loss function according to the artificially segmented vessel tree image and the reconstructed vessel tree image; the encoder reconstruction loss function is a reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator confrontation loss function is the confrontation loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced confrontation automatic encoder.
And the fundus retina image reconstruction unit is used for generating a reconstructed fundus retina image by taking the artificial segmentation retina outline mask, the reconstructed blood vessel tree image, the artificial segmentation blood vessel tree image and the original fundus retina image as the input of the generator.
And a fundus retina image discrimination unit for taking the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificially segmented blood vessel tree image and the original fundus retina image as input of a second discriminator to perform discrimination.
A generation countermeasure network loss function establishing unit for constructing a second discriminator countermeasure loss function and a generator global consistency loss function according to the original fundus retina image and the reconstructed fundus retina image; the second arbiter confrontation loss function generates the confrontation loss function of the second arbiter in the confrontation network for the improved conditional; the producer global consistency loss function generates a global consistency loss function for producers in the countermeasure network for the improved conditional.
And the synthetic model total loss function establishing unit is used for obtaining a synthetic model total loss function according to the encoder reconstruction loss function, the first discriminator countermeasure loss function, the second discriminator countermeasure loss function and the generator global consistency loss function.
An optimal model determining unit, configured to jointly train the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network in a back propagation manner according to the total loss function of the synthetic model, so that parameters in the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network are continuously updated and optimized to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator; the optimal blood vessel tree image generator is the trained enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network.
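The joint back-propagation training described above alternates discriminator and generator/encoder updates each iteration. The following schematic sketch uses hypothetical step names and decaying stand-in loss values in place of real optimizer steps, purely to show the alternating update order.

```python
def train_jointly(num_iters, step_fns):
    # Run the alternating update schedule and record (iteration, step, loss).
    history = []
    for it in range(num_iters):
        for name, step in step_fns:
            loss = step(it)
            history.append((it, name, loss))
    return history

# Stand-in loss curves: decaying values simulate convergence (assumed shapes
# of the real update steps on the synthetic model total loss).
steps = [
    ("first_discriminators", lambda it: 1.0 / (it + 1)),    # D_zc, D_zs update
    ("second_discriminator", lambda it: 0.8 / (it + 1)),    # conditional-GAN D
    ("encoders_and_generator", lambda it: 2.0 / (it + 1)),  # Q_c, Q_s, R and G
]
history = train_jointly(5, steps)
```

Each iteration touches every parameter group once, matching the continuous update-and-optimize cycle that yields the optimal generators.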
As an optional implementation manner, the synthetic model total loss function establishing unit specifically includes:
a first total loss function determining subunit, configured to obtain a total loss function of the automatic counter encoder according to the encoder reconstruction loss function and the first discriminator counter loss function;
a second total loss function determining subunit, configured to obtain a generated confrontation network total loss function according to the second discriminator confrontation loss function and the generator global consistency loss function;
and the third total loss function determining subunit is used for performing linear summation on the total loss function of the countermeasure automatic encoder and the total loss function of the generated countermeasure network to obtain a total loss function of the synthetic model.
As an alternative embodiment, the countermeasure autoencoder total loss function in the first total loss function determining subunit is

$$\mathcal{L}_{AAE}\left(D_{z_c},D_{z_s},Q_c,Q_s,R\right)=\mathcal{L}_{adv}^{D_{z_c}}+\mathcal{L}_{adv}^{D_{z_s}}+\lambda_0 L_{Recon}$$

wherein $\mathcal{L}_{AAE}$ is the countermeasure autoencoder total loss function; $D_{z_c}$ is the content encoding discriminator, $D_{z_s}$ is the style encoding discriminator, $Q_c$ is the content encoder, $Q_s$ is the style encoder, and $R$ is the decoder; $\mathcal{L}_{adv}^{D_{z_c}}$ is the countermeasure loss corresponding to the content encoding discriminator, and $\mathcal{L}_{adv}^{D_{z_s}}$ is the countermeasure loss corresponding to the style encoding discriminator; $L_{Recon}$ is the encoder reconstruction loss; the first discriminator comprises $D_{z_c}$ and $D_{z_s}$; $\lambda_0$ is the weight parameter corresponding to the countermeasure autoencoder total loss function, and is used for balancing the countermeasure losses and the reconstruction loss;

$$L_{Recon}=\mathbb{E}_{v\sim P_{data}(v)}\left[\left\|v-R\left(Q_c(v),Q_s(v)\right)\right\|_1\right]$$

wherein $v$ represents the artificially segmented blood vessel tree image; $\mathbb{E}_{v\sim P_{data}(v)}$ represents the expected value corresponding to the artificially segmented blood vessel tree image, and $P_{data}(v)$ represents the data distribution of the artificially segmented blood vessel tree image; $Q_c(v)$ is the content encoder with input $v$, and $Q_s(v)$ is the style encoder with input the artificially segmented blood vessel tree image;

$$\mathcal{L}_{adv}^{D_{z_c}}=\mathbb{E}_{z_c\sim P(z_c)}\left[\log D_{z_c}(z_c)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_c}\left(Q_c(z_c\mid v)\right)\right)\right]$$

$$\mathcal{L}_{adv}^{D_{z_s}}=\mathbb{E}_{z_s\sim P(z_s)}\left[\log D_{z_s}(z_s)\right]+\mathbb{E}_{v\sim P_{data}(v)}\left[\log\left(1-D_{z_s}\left(Q_s(z_s\mid v)\right)\right)\right]$$

wherein $z_c$ represents the content encoding vector, and $z_s$ represents the style encoding vector; $P(z_c)$ represents the prior distribution of the artificially added content latent variable, and $P(z_s)$ represents the prior distribution of the artificially added style latent variable; $\mathbb{E}_{z_c\sim P(z_c)}$ represents the expected value corresponding to the content encoding vector, and $\mathbb{E}_{z_s\sim P(z_s)}$ represents the expected value corresponding to the style encoding vector; $Q_c(z_c\mid v)$ represents the content encoding distribution function, $Q_s(z_s\mid v)$ represents the style encoding distribution function, $Q_c(z_c\mid v)$ is used for obtaining the content encoding vector, and $Q_s(z_s\mid v)$ is used for obtaining the style encoding vector; $D_{z_c}(z_c)$ is the content encoding discriminator with input $z_c$, and $D_{z_s}(z_s)$ is the style encoding discriminator with input $z_s$; $D_{z_c}(Q_c(z_c\mid v))$ is the content encoding discriminator with input $Q_c(z_c\mid v)$, and $D_{z_s}(Q_s(z_s\mid v))$ is the style encoding discriminator with input $Q_s(z_s\mid v)$.
As an optional implementation manner, the generated countermeasure network total loss function determined by the second total loss function determining subunit is

$$L_{im2im}(G,D)=L_{adv}(G,D)+\lambda_1\mathcal{L}_{global}(G)$$

$$\mathcal{L}_{global}(G)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\left\|r-G(v,m)\right\|_1\right]$$

wherein $L_{im2im}(G,D)$ is the generated countermeasure network total loss function; $G$ represents the generator in the improved conditional generation countermeasure network, and $D$ represents the second discriminator in the improved conditional generation countermeasure network; $L_{adv}(G,D)$ represents the second discriminator countermeasure loss function; $\mathcal{L}_{global}(G)$ represents the generator global consistency loss function; $\mathbb{E}_{(v,r)\sim P_{data}(v,r)}$ represents the expected value corresponding to the artificially segmented blood vessel tree image and the original fundus retina image; $v$ represents the artificially segmented blood vessel tree image, $r$ represents the original fundus retina image, $\tilde{v}$ represents the reconstructed blood vessel tree image, and $m$ is the retina outer contour mask; $\lambda_1$ is the weight parameter corresponding to the generated countermeasure network total loss function, and is used for balancing the generator global consistency loss and the second discriminator countermeasure loss; $G(v,m)$ represents the generator with inputs $v$ and $m$;

$$L_{adv}(G,D)=\mathbb{E}_{(v,r)\sim P_{data}(v,r)}\left[\log D(m,(v,r))\right]+\mathbb{E}_{\tilde{v}}\left[\log\left(1-D\left(m,\left(\tilde{v},G(m,\tilde{v})\right)\right)\right)\right]$$

wherein $\mathbb{E}_{\tilde{v}}$ represents the expected value corresponding to the reconstructed blood vessel tree image; $D(m,(v,r))$ represents the second discriminator with inputs $m$, $v$, and $r$; $D(m,(\tilde{v},G(m,\tilde{v})))$ is the second discriminator with inputs $m$, $\tilde{v}$, and $G(m,\tilde{v})$; $G(m,\tilde{v})$ represents the generator with inputs $m$ and $\tilde{v}$.
As an alternative, the synthetic model total loss function in the third total loss function determining subunit is

$$\mathcal{L}_{total}=\mathcal{L}_{AAE}+L_{im2im}(G,D)$$

wherein $\mathcal{L}_{total}$ represents the synthetic model total loss function.
The image synthesis system combining the countermeasure autoencoder and the generation countermeasure network in this embodiment can augment a limited medical image data set with additional samples, enrich a medical image database, improve the stability of the synthesis network training process, and improve the precision and generalization performance of image synthesis, thereby promoting improvements in the precision and generalization performance of medical technologies such as medical image segmentation, medical image registration, and medical image recognition.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principle and the embodiment of the present invention are explained by applying specific examples, and the above description of the embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (8)

1. An image synthesis method combining a countermeasure autoencoder and a generation countermeasure network, comprising:
constructing an enhanced countermeasure autoencoder; the enhanced countermeasure automatic encoder comprises two groups of encoders of different categories, two groups of first discriminators of different categories, and a group of decoders; the two groups of encoders of different categories comprise a content encoder and a style encoder, and the two groups of first discriminators of different categories comprise a content encoding discriminator and a style encoding discriminator; the two groups of encoders of different categories are used for obtaining, from an input artificially segmented blood vessel tree image, a content encoding vector and a style encoding vector of the artificially segmented blood vessel tree image; the two groups of first discriminators of different categories are used for discriminating between the content encoding vector and an artificially acquired prior content encoding vector, and between the style encoding vector and an artificially acquired prior style encoding vector, and for performing reverse adjustment training; the decoder is used for recombining the content encoding vector and the style encoding vector to obtain a reconstructed blood vessel tree image;
constructing an improved conditional generation countermeasure network; the improved conditional generation countermeasure network comprises a generator and a second discriminator; the generator is used for generating a reconstructed fundus retina image according to the artificial segmentation retina outline mask and the reconstructed blood vessel tree image output by the enhanced countermeasure automatic encoder; the second discriminator is used for judging the reconstructed fundus retina image and carrying out reverse adjustment training;
taking the artificially segmented blood vessel tree image and an original fundus retina image as training data, and performing iterative training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator;
performing fundus retina image synthesis on the to-be-processed artificially segmented blood vessel tree image based on the optimal blood vessel tree image generator and the optimal fundus retina image generator to obtain a synthesized image;
the step of performing iterative training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network by using the artificially segmented blood vessel tree image and the original fundus retinal image as training data to obtain an optimal blood vessel tree image generator and an optimal fundus retinal image generator specifically includes:
acquiring an artificial segmentation blood vessel tree image and an original fundus retina image to obtain training data; the artificial segmented blood vessel tree image corresponds to the original fundus retina image;
respectively taking the manually segmented vessel tree image as the input of the content encoder and the style encoder to obtain a content coding vector and a style coding vector, and taking the content coding vector and the style coding vector as the input of the decoder to obtain a reconstructed vessel tree image;
taking the content coding vector and the prior content coding vector as the input of the content coding discriminator to distinguish, and taking the style coding vector and the prior style coding vector as the input of the style coding discriminator to distinguish;
constructing an encoder reconstruction loss function and a first discriminator countervailing loss function according to the artificially segmented blood vessel tree image and the reconstructed blood vessel tree image; the encoder reconstruction loss function is a reconstruction loss function corresponding to the content encoder and the style encoder in the enhanced countermeasure automatic encoder; the first discriminator confrontation loss function is a confrontation loss function corresponding to the content coding discriminator and the style coding discriminator in the enhanced confrontation automatic encoder;
using the artificial segmented retina outer contour mask, the reconstructed blood vessel tree image, the artificial segmented blood vessel tree image and the original fundus retina image as the input of the generator to generate a reconstructed fundus retina image;
taking the reconstructed blood vessel tree image, the reconstructed fundus retina image, the artificial segmented blood vessel tree image and the original fundus retina image as input of a second discriminator to carry out judgment;
constructing a second discriminator countermeasure loss function and a generator global consistency loss function according to the original fundus retina image and the reconstructed fundus retina image; the second arbiter confrontation loss function generates the confrontation loss function of the second arbiter in the confrontation network for the improved conditional; the generator global consistency loss function is a global consistency loss function of the generator in the countermeasure network generated by the improved conditional expression;
obtaining a total loss function of a synthetic model according to the reconstruction loss function of the encoder, the countermeasure loss function of the first discriminator, the countermeasure loss function of the second discriminator and the global consistency loss function of the generator;
performing joint training on the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network in a back propagation manner according to the total loss function of the synthetic model, so that parameters in the enhanced countermeasure automatic encoder and the improved conditional generation countermeasure network are continuously updated and optimized to obtain an optimal blood vessel tree image generator and an optimal fundus retina image generator; the optimal blood vessel tree image generator is the trained enhanced countermeasure automatic encoder, and the optimal fundus retina image generator is the trained improved conditional generation countermeasure network.
2. The image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to claim 1, characterized in that said constructing an enhanced countermeasure autoencoder comprises:
constructing a content encoder; the content encoder comprises a first down-sampling layer and a first residual block network which are connected in sequence; the first down-sampling layer comprises a convolution layer and a first convolution block; the first network of residual blocks comprises a plurality of first standard residual blocks; the convolution layer comprises a convolution kernel and an activation function layer; the first convolution block comprises a convolution kernel, an instance normalization layer and an activation function layer; the first standard residual block comprises two first convolution blocks connected in a jumping-over manner;
constructing a style encoder; the style encoder comprises a second down-sampling layer, a global average pooling layer and a full-connection layer which are connected in sequence; the second downsampling layer includes a plurality of the convolutional layers;
constructing a content coding discriminator; the content coding discriminator comprises a plurality of first convolution blocks which are connected in sequence;
constructing a style code discriminator; the style coding discriminator comprises a plurality of second convolution blocks which are connected in sequence;
constructing a decoder; the decoder comprises a second residual block network, a multilayer perceptron and an upsampling layer; the second network of residual blocks comprises a plurality of first standard residual blocks; the first standard residual block comprises two second convolution blocks connected in a jumping-over manner; said second convolution block includes a convolution layer, an adaptive instance normalization layer, and an activation function layer; the multilayer perceptron is used for outputting self-adaptive example normalized style parameters corresponding to the style coding vectors; the up-sampling layer comprises an anti-convolution layer and an activation function layer; the number of upsampling layers matches the sum of the number of first downsampling layers and the number of second downsampling layers.
3. The image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to claim 1, characterized in that said constructing an improved conditional generation countermeasure network comprises:
constructing a generator; the generator comprises a convolution layer, a deconvolution layer and a channel-by-channel full connection layer, and the connection mode of the generator is jump connection; the number of the convolution layers is matched with that of the deconvolution layers; the convolution layer comprises a convolution kernel and an activation function layer; the deconvolution layer comprises a deconvolution kernel and an activation function layer;
constructing a second discriminator; the second discriminator comprises a convolution layer, a full connection layer and a binary layer corresponding to the discriminator; and the convolution layer corresponding to the discriminator comprises a combined batch normalization layer and an activation function layer.
4. The image synthesis method combining a countermeasure autoencoder and a generation countermeasure network according to claim 1, characterized in that said obtaining a synthetic model total loss function from the encoder reconstruction loss function, the first discriminator countermeasure loss function, the second discriminator countermeasure loss function, and the generator global consistency loss function specifically comprises:
obtaining a total loss function of the confrontation automatic encoder according to the encoder reconstruction loss function and the first discriminator confrontation loss function;
obtaining a total loss function of the generated countermeasure network according to the countermeasure loss function of the second discriminator and the global consistency loss function of the generator;
and linearly adding the total loss function of the countermeasure automatic encoder and the total loss function of the generated countermeasure network to obtain a total loss function of the synthetic model.
5. The method of image synthesis combining an adversarial autoencoder and a generative adversarial network according to claim 4, wherein the adversarial autoencoder total loss function is

$$L_{AAE}(Q_c, Q_s, R, D_{z_c}, D_{z_s}) = L_{adv}^{z_c}(Q_c, D_{z_c}) + L_{adv}^{z_s}(Q_s, D_{z_s}) + \lambda_0 L_{Recon}(Q_c, Q_s, R)$$

wherein $L_{AAE}$ is the adversarial autoencoder total loss function; $D_{z_c}$ is the content code discriminator and $D_{z_s}$ is the style code discriminator; $Q_c$ is the content encoder, $Q_s$ is the style encoder, and $R$ is the decoder; $L_{adv}^{z_c}$ is the adversarial loss of the content code discriminator and $L_{adv}^{z_s}$ is the adversarial loss of the style code discriminator; $L_{Recon}$ is the encoder reconstruction loss; the first discriminator comprises $D_{z_c}$ and $D_{z_s}$; $\lambda_0$ is the weight parameter of the adversarial autoencoder total loss function, used to balance the adversarial losses against the reconstruction loss;

$$L_{Recon} = \mathbb{E}_{v \sim P_{data}(v)}\left[\left\| v - R\big(Q_c(v), Q_s(v)\big)\right\|\right]$$

wherein $v$ denotes an artificially segmented vessel tree image; $\mathbb{E}_{v \sim P_{data}(v)}$ denotes the expectation over the artificially segmented vessel tree images, and $P_{data}(v)$ denotes their data distribution; $Q_c(v)$ denotes the content encoder with input $v$ and $Q_s(v)$ the style encoder with input $v$;

$$L_{adv}^{z_c} = \mathbb{E}_{z_c \sim P(z_c)}\big[\log D_{z_c}(z_c)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_c}(Q_c(z_c \mid v))\big)\big]$$

$$L_{adv}^{z_s} = \mathbb{E}_{z_s \sim P(z_s)}\big[\log D_{z_s}(z_s)\big] + \mathbb{E}_{v \sim P_{data}(v)}\big[\log\big(1 - D_{z_s}(Q_s(z_s \mid v))\big)\big]$$

wherein $z_c$ denotes the content code vector and $z_s$ the style code vector; $P(z_c)$ denotes the prior distribution of the artificially imposed content latent variable and $P(z_s)$ the prior distribution of the artificially imposed style latent variable; $\mathbb{E}_{z_c \sim P(z_c)}$ denotes the expectation over the content code prior and $\mathbb{E}_{z_s \sim P(z_s)}$ the expectation over the style code prior; $Q_c(z_c \mid v)$ denotes the content encoding distribution function and $Q_s(z_s \mid v)$ the style encoding distribution function, from which the content code vector and the style code vector are obtained; $D_{z_c}(z_c)$ is the content code discriminator with input $z_c$ and $D_{z_s}(z_s)$ the style code discriminator with input $z_s$; $D_{z_c}(Q_c(z_c \mid v))$ is the content code discriminator with input $Q_c(z_c \mid v)$ and $D_{z_s}(Q_s(z_s \mid v))$ the style code discriminator with input $Q_s(z_s \mid v)$.
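Under the usual adversarial-autoencoder reading of claim 5, the total loss can be assembled numerically as below. This is a hedged sketch: discriminator outputs are assumed to lie in (0, 1), the reconstruction norm is taken as L1 (the claim does not fix the norm), and all function names are illustrative:

```python
import numpy as np

def code_adversarial_loss(d_prior, d_encoded):
    """Adversarial loss on latent codes: E[log D(z)] over codes drawn
    from the prior plus E[log(1 - D(Q(z|v)))] over encoder outputs.
    d_prior / d_encoded: arrays of discriminator outputs in (0, 1)."""
    return np.mean(np.log(d_prior)) + np.mean(np.log(1.0 - d_encoded))

def aae_total_loss(d_c_prior, d_c_enc, d_s_prior, d_s_enc,
                   v, v_recon, lam0=1.0):
    """L_AAE = content-code adversarial term + style-code adversarial
    term + lam0 * encoder reconstruction term."""
    l_adv_c = code_adversarial_loss(d_c_prior, d_c_enc)
    l_adv_s = code_adversarial_loss(d_s_prior, d_s_enc)
    l_recon = np.mean(np.abs(v - v_recon))  # L1 norm assumed here
    return l_adv_c + l_adv_s + lam0 * l_recon
```

With a perfect reconstruction the loss reduces to the two adversarial terms alone, which is the balance that `lam0` ($\lambda_0$) controls.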
6. The method of image synthesis combining an adversarial autoencoder and a generative adversarial network according to claim 5, wherein the generative adversarial network total loss function is

$$L_{im2im}(G, D) = L_{adv}(G, D) + \lambda_1 L_{global}(G)$$

wherein $L_{im2im}(G, D)$ is the generative adversarial network total loss function; $G$ denotes the generator in the improved conditional generative adversarial network and $D$ denotes the second discriminator in the improved conditional generative adversarial network; $L_{adv}(G, D)$ denotes the second discriminator adversarial loss function; $L_{global}(G)$ denotes the generator global consistency loss function;

$$L_{global}(G) = \mathbb{E}_{(v, r)}\left[\left\| m \odot \big(r - G(\hat{v}, m)\big)\right\|_1\right]$$

wherein $\mathbb{E}_{(v, r)}$ denotes the expectation over pairs of artificially segmented vessel tree images and original fundus retina images; $v$ denotes the artificially segmented vessel tree image, $r$ denotes the original fundus retina image, $\hat{v}$ denotes the reconstructed vessel tree image, and $m$ is the retina outer contour mask; $\lambda_1$ is the weight parameter of the generative adversarial network total loss function, used to balance the generator global consistency loss against the second discriminator adversarial loss; $G(\hat{v}, m)$ denotes the generator with inputs $\hat{v}$ and $m$;

$$L_{adv}(G, D) = \mathbb{E}_{(v, r)}\big[\log D(m, (v, r))\big] + \mathbb{E}_{\hat{v}}\big[\log\big(1 - D(m, (\hat{v}, G(m, \hat{v})))\big)\big]$$

wherein $\mathbb{E}_{\hat{v}}$ denotes the expectation over the reconstructed vessel tree images; $D(m, (v, r))$ denotes the second discriminator with inputs $m$, $v$ and $r$; $D(m, (\hat{v}, G(m, \hat{v})))$ denotes the second discriminator with inputs $m$, $\hat{v}$ and $G(m, \hat{v})$; $G(m, \hat{v})$ denotes the generator with inputs $m$ and $\hat{v}$.
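Claim 6 can likewise be sketched numerically: an adversarial term over the second discriminator's outputs plus a mask-restricted consistency term between the original and generated fundus images. Discriminator outputs in (0, 1), a binary mask, and an L1 penalty are assumptions of this sketch, and all names are illustrative:

```python
import numpy as np

def global_consistency_loss(r, r_fake, m):
    """L1 difference between the original fundus image r and the
    generated image r_fake, evaluated only inside the (binary)
    retina outer contour mask m."""
    return np.sum(m * np.abs(r - r_fake)) / np.maximum(np.sum(m), 1.0)

def gan_total_loss(d_real, d_fake, r, r_fake, m, lam1=1.0):
    """L_im2im = adversarial term + lam1 * global consistency term."""
    l_adv = np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))
    return l_adv + lam1 * global_consistency_loss(r, r_fake, m)
```

Restricting the consistency term to the mask keeps the generator from being penalized for the black background outside the retina.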
7. The method of image synthesis combining an adversarial autoencoder and a generative adversarial network according to claim 6, wherein the synthesis model total loss function is

$$L_{total} = L_{AAE} + L_{im2im}$$

wherein $L_{total}$ denotes the synthesis model total loss function.
8. An image synthesis system combining an adversarial autoencoder and a generative adversarial network, comprising:
an enhanced adversarial autoencoder construction module for constructing an enhanced adversarial autoencoder; the enhanced adversarial autoencoder comprises two encoders of different classes, two first discriminators of different classes, and one decoder; the two encoders comprise a content encoder and a style encoder, and the two first discriminators comprise a content code discriminator and a style code discriminator; the two encoders are used for obtaining, from an input artificially segmented vessel tree image, a content code vector and a style code vector of that image; the two first discriminators are used for discriminating between the content code vector and an artificially imposed prior content code vector, discriminating between the style code vector and an artificially imposed prior style code vector, and performing back-propagation training; the decoder is used for recombining the content code vector and the style code vector to obtain a reconstructed vessel tree image;
an improved conditional generative adversarial network construction module for constructing an improved conditional generative adversarial network; the improved conditional generative adversarial network comprises a generator and a second discriminator; the generator is used for generating a reconstructed fundus retina image from the artificially segmented retina outer contour mask and the reconstructed vessel tree image output by the enhanced adversarial autoencoder; the second discriminator is used for discriminating the reconstructed fundus retina image and performing back-propagation training;
a model training module for taking the artificially segmented vessel tree images and the original fundus retina images as training data and iteratively training the enhanced adversarial autoencoder and the improved conditional generative adversarial network to obtain an optimal vessel tree image generator and an optimal fundus retina image generator;
a synthesized image determination module for synthesizing a fundus retina image from an artificially segmented vessel tree image to be processed, based on the optimal vessel tree image generator and the optimal fundus retina image generator, to obtain a synthesized image;
the model training module specifically comprises:
a training data acquisition unit for acquiring artificially segmented vessel tree images and original fundus retina images to obtain training data; each artificially segmented vessel tree image corresponds to an original fundus retina image;
a vessel tree image reconstruction unit for taking the artificially segmented vessel tree image as the input of the content encoder and of the style encoder to obtain a content code vector and a style code vector, and taking the content code vector and the style code vector as the input of the decoder to obtain a reconstructed vessel tree image;
a code discrimination unit configured to take the content code vector and the prior content code vector as inputs of the content code discriminator for discrimination, and to take the style code vector and the prior style code vector as inputs of the style code discriminator for discrimination;
an enhanced adversarial autoencoder loss function establishing unit for constructing an encoder reconstruction loss function and a first discriminator adversarial loss function from the artificially segmented vessel tree image and the reconstructed vessel tree image; the encoder reconstruction loss function is the reconstruction loss function of the content encoder and the style encoder in the enhanced adversarial autoencoder; the first discriminator adversarial loss function is the adversarial loss function of the content code discriminator and the style code discriminator in the enhanced adversarial autoencoder;
a fundus retina image reconstruction unit for generating a reconstructed fundus retina image by taking the artificially segmented retina outer contour mask, the reconstructed vessel tree image, the artificially segmented vessel tree image and the original fundus retina image as the inputs of the generator;
a fundus retina image discrimination unit configured to take the reconstructed vessel tree image, the reconstructed fundus retina image, the artificially segmented vessel tree image and the original fundus retina image as inputs of the second discriminator for discrimination;
a generative adversarial network loss function establishing unit for constructing a second discriminator adversarial loss function and a generator global consistency loss function from the original fundus retina image and the reconstructed fundus retina image; the second discriminator adversarial loss function is the adversarial loss function of the second discriminator in the improved conditional generative adversarial network; the generator global consistency loss function is the global consistency loss function of the generator in the improved conditional generative adversarial network;
a synthesis model total loss function establishing unit for obtaining a synthesis model total loss function from the encoder reconstruction loss function, the first discriminator adversarial loss function, the second discriminator adversarial loss function and the generator global consistency loss function;
an optimal model determination unit for jointly training the enhanced adversarial autoencoder and the improved conditional generative adversarial network by back propagation according to the synthesis model total loss function, so that the parameters of the enhanced adversarial autoencoder and of the improved conditional generative adversarial network are continuously updated and optimized to obtain an optimal vessel tree image generator and an optimal fundus retina image generator; the optimal vessel tree image generator is the trained joint enhanced adversarial autoencoder, and the optimal fundus retina image generator is the trained improved conditional generative adversarial network.
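At inference time, the trained system of claim 8 chains the two networks: the adversarial autoencoder reconstructs the vessel tree, and the conditional generator renders the fundus image from that reconstruction and the retina contour mask. A schematic of the pipeline with the four trained components passed in as callables (all names illustrative; the stub components in the usage below only demonstrate the data flow):

```python
def synthesize(v, m, content_encoder, style_encoder, decoder, generator):
    """Two-stage synthesis: vessel tree image v and retina contour
    mask m in, synthesized fundus retina image out."""
    z_c = content_encoder(v)       # content code vector
    z_s = style_encoder(v)         # style code vector
    v_hat = decoder(z_c, z_s)      # reconstructed vessel tree image
    return generator(v_hat, m)     # rendered fundus retina image
```

For example, with trivial stand-ins for the trained networks, `synthesize(2.0, 0.5, lambda v: 0.5 * v, lambda v: 0.5 * v, lambda zc, zs: zc + zs, lambda vh, m: vh * m)` traces the same data flow end to end.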
CN202010169306.5A 2020-03-12 2020-03-12 Image synthesis method and system combining countermeasure autoencoder and generation countermeasure network Expired - Fee Related CN111402179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010169306.5A CN111402179B (en) 2020-03-12 2020-03-12 Image synthesis method and system combining countermeasure autoencoder and generation countermeasure network

Publications (2)

Publication Number Publication Date
CN111402179A CN111402179A (en) 2020-07-10
CN111402179B true CN111402179B (en) 2022-08-09






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220809