CN110544275A - Methods, systems, and media for generating registered multi-modality MRI with lesion segmentation tags - Google Patents


Info

Publication number: CN110544275A
Application number: CN201910764408.9A
Authority: CN (China)
Prior art keywords: modality, structural feature, decoder, CodeF, random
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN110544275B (granted publication)
Inventors: 瞿毅力, 王莹, 苏琬棋, 邓楚富, 卢宇彤, 陈志广
Original and current assignee: Sun Yat-sen University
Application filed by Sun Yat-sen University; priority to CN201910764408.9A
Publication of CN110544275A; application granted and published as CN110544275B

Classifications

    • G06F18/253 Pattern recognition; analysing; fusion techniques of extracted features
    • G06T7/38 Image analysis; registration of image sequences (determination of transform parameters for the alignment of images)
    • G06T2207/10088 Image acquisition modality; tomographic images; magnetic resonance imaging [MRI]
    • G06T2207/20081 Special algorithmic details; training; learning
    • G06T2207/20084 Special algorithmic details; artificial neural networks [ANN]
    • G06T2207/30096 Subject of image; biomedical image processing; tumor; lesion

Abstract

The invention discloses a method, a system and a medium for generating registered multi-modality MRI with lesion segmentation labels. The method comprises: obtaining a random matrix drawn from a normal distribution and inputting it into the trained structural feature decoder of a generative adversarial network, which decodes it into a structural feature map; fusing the structural feature map with a randomly selected lesion segmentation label map as random input to obtain a fusion result; inputting the fusion result into the trained random encoder of the generative adversarial network to obtain a code; and inputting the code into the trained decoder of each modality in the generative adversarial network to generate the registered multi-modality MRI. The generator of the generative adversarial network is modularized into encoders and decoders; through the joint training of several groups of encoders, decoders and discriminators, the network can accept a random input meeting the design specification and generate a set of registered multi-modality MRI images with lesion segmentation labels, so the method can be widely applied in the field of medical imaging.

Description

Methods, systems, and media for generating registered multi-modality MRI with lesion segmentation tags
Technical Field
The invention relates to image generation technology in the medical field, and in particular to a method, a system and a medium for generating registered multi-modality MRI with lesion segmentation labels, which acquire multi-modality MRI images with lesion segmentation labels from a given random input meeting the specification.
Background
With the development of deep learning, more and more fields have begun to adopt deep neural networks for image processing tasks. However, training a deep neural network requires a large amount of data, and in many fields the construction of data sets is very difficult. Image generation technology therefore has important applications in intelligent image processing scenarios in many fields, such as medical images and biological cell images. In intelligent medical image processing, medical images come in many modalities, such as Magnetic Resonance Imaging (MRI), X-ray and CT. When different modalities are obtained from the same part of the same patient by different imaging techniques, the modalities are considered registered if the imaging positions and viewing angles coincide. Compared with single-modality data, registered multi-modality image data provides more information, supports more complex application scenarios, better meets the training-data requirements of deep neural networks, and helps to provide more efficient and reliable intelligent diagnosis services. However, medical image collection is very difficult, especially for rare diseases, making medical image data sets both scarce and small, which renders many training tasks infeasible. Registered multi-modality image data is naturally even scarcer. Therefore, generating registered multi-modality images by means of image generation techniques has wide uses and profound significance.
A generative adversarial network (GAN) is a flexible deep neural network that can be trained either unsupervised or supervised, and it has been widely used in the field of computer vision. A generative adversarial network generally comprises a generator and a discriminator: the generator accepts random input and generates realistic images, while the discriminator learns to distinguish real images from generated images and thereby guides the generator to generate more realistic images. However, how to obtain registered multi-modality MRI images with lesion segmentation labels from a given random input meeting the specification remains a key technical problem to be solved urgently.
Disclosure of the Invention
The technical problem to be solved by the invention is as follows: aiming at the above problems in the prior art, the invention provides a method, a system and a medium for generating registered multi-modality MRI with lesion segmentation labels. The invention modularizes the generator of a generative adversarial network into encoders and decoders; through the joint training of several groups of encoders, decoders and discriminators, the network can accept a random input meeting the design specification and generate a set of registered multi-modality MRI images with lesion segmentation labels, so the method can be widely applied in the field of medical imaging.
In order to solve the above technical problem, the invention adopts the following technical scheme:
a method of generating registered multi-modality MRI with lesion segmentation tags, the implementation steps comprising:
1) Obtaining a random matrix CodeF, RM from the normal distribution N (0, 12);
2) inputting the random matrix CodeF and RM to generate a structural feature decoder DecoderF which is trained in the countermeasure network, and decoding to generate a structural feature map FRM;
3) performing random input fusion on the structural feature map FRM and the randomly selected focus segmentation label map L to obtain a fusion result;
4) inputting the fusion result into a trained random encoder EncoderRM in the generated countermeasure network to obtain a coded CodeRM;
5) And inputting the coded CodeRM into an i-modal decoder Decoderi for generating each trained modal i in the countermeasure network, and respectively generating the i-modal MRIig after registration.
Optionally, step 2) is preceded by a step of training the structural feature decoder DecoderF of the generative adversarial network, the detailed steps comprising:
A1) randomly selecting a modality, obtaining an image n from it, extracting structural features to obtain a structural feature map F, and extracting a mask to obtain the corresponding Mask;
A2) encoding the structural feature map F with the structural feature encoder EncoderF of the generative adversarial network to obtain a mean matrix CodeF,mean and a variance matrix CodeF,logvar, obtaining random noise Codee from the normal distribution N(0, 1²), and synthesizing the three codes CodeF,mean, CodeF,logvar and Codee into an approximately normally distributed matrix CodeF with added noise;
A3) decoding the matrix CodeF with the mask decoder DecoderMask to obtain a reconstructed mask Maskr; decoding CodeF with the structural feature decoder DecoderF to obtain a reconstructed structural feature map Fr;
A4) randomly generating a matrix CodeF,RM conforming to the normal distribution N(0, 1²), decoding CodeF,RM with the structural feature decoder DecoderF to obtain a generated random structural feature map FRM, and decoding CodeF,RM with the mask decoder DecoderMask to obtain a generated random mask MaskRM;
A5) discriminating (F, Mask) and (FRM, MaskRM) with the structural feature map discriminator DiscriminatorF, which discriminates (F, Mask) as real and (FRM, MaskRM) as fake, where F is the structural feature map, Mask is the mask, FRM is the random structural feature map and MaskRM is the random mask; discriminating CodeF and CodeF,RM with the structural feature discriminator FeatureDiscriminatorF, which discriminates CodeF as fake and CodeF,RM as real, where CodeF is the approximately normally distributed matrix with added noise and CodeF,RM is the randomly generated matrix conforming to the normal distribution N(0, 1²);
A6) calculating losses from the output of each step and the corresponding loss functions, invoking the optimizer to differentiate the loss functions and obtain the gradients of the model parameters in each component, and then subtracting the corresponding gradients from the parameters to complete the update of the network parameters;
A7) judging whether a preset iteration termination condition is met, the termination condition being that the loss function value falls below a set threshold or the number of iterations reaches a set number of steps; if not, jumping back to step A1); otherwise, exiting.
Optionally, the functional expression of the approximately normally distributed matrix CodeF with added noise obtained by synthesis in step A2) is:
CodeF = CodeF,mean + exp(0.5 * CodeF,logvar) * Codee;
where CodeF,mean is the mean matrix obtained by encoding the structural feature map F with the structural feature encoder EncoderF, CodeF,logvar is the variance matrix obtained by encoding the structural feature map F with the structural feature encoder EncoderF, and Codee is the random noise obtained from the normal distribution N(0, 1²).
Optionally, the detailed steps of step 3) comprise:
3.1) randomly selecting a lesion segmentation label map L and converting the label map, which contains several categories, into a one-hot matrix, obtaining a multi-dimensional label matrix whose number of channels equals the number of categories; in each channel only part of the pixels are valid and the rest are filled with 0, and the non-zero pixel regions are registered with the corresponding segmentation regions of the lesion segmentation label map;
3.2) stacking the multi-dimensional label matrix and the structural feature map FRM together along the channel dimension to obtain a matrix fusing the two input sources as the fusion result.
Optionally, step 4) is preceded by a step of training the i-modality decoder Decoderi and i-modality encoder Encoderi of each modality i and the lesion segmentation label decoder DecoderL, and of training the random encoder EncoderRM, the detailed steps comprising:
B1) training the i-modality decoder Decoderi, the i-modality encoder Encoderi and the lesion segmentation label decoder DecoderL of each modality i, and training the random encoder EncoderRM;
B2) calculating losses from the output of each training step and the corresponding loss functions, invoking the optimizer to differentiate the loss functions and obtain the gradients of the model parameters in each component, and then subtracting the corresponding gradients from the parameters to complete the update of the network parameters;
B3) judging whether a preset iteration termination condition is met, the termination condition being that the loss function value falls below a set threshold or the number of iterations reaches a set number of steps; if not, jumping back to step B1); otherwise, exiting.
Optionally, the detailed steps of training the i-modality decoder Decoderi, the i-modality encoder Encoderi and the lesion segmentation label decoder DecoderL of each modality i in step B1) comprise:
Step 1, inputting an original image i of a random modality i;
Step 2, encoding the original image i with the i-modality encoder Encoderi to obtain a code Codei;
Step 3, decoding the code Codei with the i-modality decoder Decoderi to obtain a reconstructed image ir; decoding the code Codei with the lesion segmentation label decoder DecoderL to obtain the lesion segmentation label map Li,f; meanwhile, for every other modality j: first decoding the code Codei with the j-modality decoder Decoderj to obtain a j-modality conversion map jt of the original image i, then encoding the conversion map jt with the j-modality encoder Encoderj to obtain a code Codej,t, and then decoding the code Codej,t with the i-modality decoder Decoderi to obtain a cyclic reconstruction map ic of the original image i with modality j as the intermediate modality;
Step 4, discriminating, with the modality discriminator Discriminatori, the original image i and each i-modality conversion map it converted from a modality j to modality i, discriminating the former as real and the latter as fake.
Optionally, the step of training the random encoder EncoderRM in step B1) comprises:
Step 1, randomly selecting a modality, obtaining an image n and the corresponding lesion segmentation label map Ln from it, obtaining a structural feature map F1 with the structural feature extraction method, and obtaining the corresponding Mask with the mask extraction method; removing the lesion information of the structural feature map F1 with the lesion segmentation label map Ln to obtain a structural feature map F without lesion information;
Step 2, fusing the structural feature map F with a randomly input lesion segmentation label map L as random input to obtain the fusion result FRM,expand;
Step 3, feeding the fusion result FRM,expand into the random encoder EncoderRM to be encoded into a code CodeRM;
Step 4, inputting the code CodeRM into the lesion segmentation label decoder DecoderL to decode a reconstructed lesion segmentation label map Lr; meanwhile, for each modality i: inputting the code CodeRM into the i-modality decoder Decoderi to obtain the i-modality generated image ig, extracting structural features from the generated image ig to obtain the structural feature map Fi,g, inputting the generated image ig into the i-modality encoder Encoderi to obtain the code Codei,g, decoding the code Codei,g with the lesion segmentation label decoder DecoderL to obtain Li,g, and inputting Codei,g into the j-modality decoder Decoderj of every other modality j to obtain the corresponding j-modality generated conversion map jg,t;
Step 5, for each modality i, discriminating, with the i-modality discriminator Discriminatori, the original image of that modality and the i-modality generated image ig, discriminating the former as real and the latter as fake; discriminating, with the feature discriminator FeatureDiscriminator, the code CodeRM and the code Codei of each modality i, discriminating CodeRM as fake and the Codei of each modality i as real.
Further, the invention provides a system for generating registered multi-modality MRI with lesion segmentation labels, comprising:
a random matrix generation program unit for obtaining a random matrix CodeF,RM from the normal distribution N(0, 1²);
a structural feature generation program unit for inputting the random matrix CodeF,RM into the trained structural feature decoder DecoderF of the generative adversarial network, which decodes it into a structural feature map FRM;
a structural feature fusion program unit for fusing the structural feature map FRM with a randomly selected lesion segmentation label map L as random input to obtain a fusion result;
a random encoding program unit for inputting the fusion result into the trained random encoder EncoderRM of the generative adversarial network to obtain a code CodeRM;
and a registered image generation program unit for inputting the code CodeRM into the trained i-modality decoder Decoderi of each modality i in the generative adversarial network to generate the registered i-modality MRI ig, respectively.
Further, the invention also provides a system for generating registered multi-modality MRI with lesion segmentation labels, comprising a computer device programmed or configured to perform the steps of the aforementioned method, or whose storage medium stores a computer program programmed or configured to perform the aforementioned method.
Furthermore, the invention also provides a computer-readable storage medium storing a computer program programmed or configured to perform the aforementioned method of generating registered multi-modality MRI with lesion segmentation labels.
Compared with the prior art, the invention has the following advantages:
1. The generator of the generative adversarial network is modularized into encoders and decoders; through the joint training of several groups of encoders, decoders and discriminators, the network can accept a random input meeting the design specification and generate a set of registered multi-modality MRI images with lesion segmentation labels, so the method can be widely applied in the field of medical imaging.
2. The data used for training does not need to be registered and the learning is unsupervised, yet the method can generate registered multi-modality MRI; the generated data is labeled, and there is no limit on the number of modalities.
3. The invention adopts a modular design, which makes modality extension convenient and model training more flexible: the modules can be trained independently or synchronously, and the trained modules can be combined and reused at inference time.
Drawings
FIG. 1 is a schematic diagram of the basic principle of the method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the basic flow of the method according to an embodiment of the present invention.
FIG. 3 is a diagram of the training architecture for structural feature map generation according to an embodiment of the present invention.
FIG. 4 is a diagram of the auxiliary training architecture in the modality registration image generation training of an embodiment of the present invention.
FIG. 5 is a diagram of the generation training architecture in the modality registration image generation training of an embodiment of the present invention.
FIG. 6 is a main flow chart of the modality registration image generation training according to an embodiment of the present invention.
Detailed Description
The method, system and medium for generating registered multi-modality MRI with lesion segmentation labels of the present invention will be described in further detail below, using the x and y modalities as an example. It should be noted, however, that the method, system and medium of the present invention are not limited to two-modality registered image generation, but can be applied to the generation of registered multi-modality MRI with three or more modalities.
As shown in FIGS. 1 and 2, the implementation steps of the method for generating registered multi-modality MRI with lesion segmentation labels of this embodiment include:
1) obtaining a random matrix CodeF,RM from the normal distribution N(0, 1²), which can be expressed as N(0, 1²) → CodeF,RM;
2) inputting the random matrix CodeF,RM into the trained structural feature decoder DecoderF (DCF for short) of the generative adversarial network, which decodes it into a structural feature map FRM;
3) fusing the structural feature map FRM with a randomly selected lesion segmentation label map L as random input to obtain a fusion result; referring to FIG. 2, the randomly selected lesion segmentation label map L is obtained by randomly applying simple image transformations such as translation, flipping, rotation and scaling to a real lesion label map;
4) inputting the fusion result into the trained random encoder EncoderRM (ECR for short) of the generative adversarial network to obtain a code CodeRM;
5) inputting the code CodeRM into the trained i-modality decoder Decoderi of each modality i in the generative adversarial network to generate the registered i-modality MRI ig, respectively. Referring to FIG. 1, taking the two modalities x and y as an example: the code CodeRM is input into the trained x-modality decoder Decoderx (DCx for short) of the generative adversarial network to generate the registered x-modality image xg, and into the trained y-modality decoder Decodery (DCy for short) to generate the registered y-modality image yg.
This embodiment places the following random input specification on the method for generating registered multi-modality MRI with lesion segmentation labels: the random input comprises a random structural feature map and a random segmentation label map. The random structural feature map is decoded by a decoder from a standard normally distributed matrix (a real structural feature map, by contrast, is extracted from a real image by the structural feature map extraction method based on the Sobel operator). The random segmentation label map is a segmentation label of each part of a randomly selected real modality image. This embodiment can receive any random input meeting this specification and generate a set of registered, segmentation-labeled multi-modality MRI data.
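To make steps 1) to 5) concrete, the following is a minimal PyTorch-style sketch of the generation pipeline. It is not the patent's implementation: the module handles (decoder_f, encoder_rm, decoders), the tensor shapes and the helper name generate_registered_mri are assumptions, while the step structure and the one-hot fusion of steps 3.1) and 3.2) follow this description.

```python
import torch
import torch.nn.functional as F_nn

def generate_registered_mri(decoder_f, encoder_rm, decoders, label_map,
                            code_shape, num_classes):
    """Steps 1)-5): random code -> structural feature map -> fusion ->
    CodeRM -> one registered image per modality. All networks are assumed
    to be already-trained torch.nn.Module instances."""
    # 1) Random matrix CodeF,RM drawn from N(0, 1^2).
    code_f_rm = torch.randn(code_shape)
    # 2) Decode it into a random structural feature map FRM.
    frm = decoder_f(code_f_rm)
    # 3) Fuse FRM with the one-hot lesion segmentation label map L
    #    by stacking along the channel dimension (steps 3.1-3.2).
    onehot = F_nn.one_hot(label_map.long(), num_classes)   # H x W x C
    onehot = onehot.permute(2, 0, 1).unsqueeze(0).float()  # 1 x C x H x W
    fusion = torch.cat([onehot, frm], dim=1)
    # 4) Encode the fusion result into CodeRM.
    code_rm = encoder_rm(fusion)
    # 5) Decode CodeRM with each trained modality decoder Decoderi.
    return {name: dec(code_rm) for name, dec in decoders.items()}
```

With decoders = {'x': decoder_x, 'y': decoder_y}, this reproduces the two-modality example of FIG. 1.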
This embodiment further includes, before step 2), a step of training the structural feature decoder DecoderF (DCF for short) of the generative adversarial network. This part can be trained independently; after training is completed, the generated random structural feature map is used to further generate a set of registered, lesion-segmentation-labeled multi-modality MRI data. Based on the idea of the variational autoencoder (VAE), this embodiment encodes the structural feature map into a normally distributed space and then decodes the structural feature map from that space with the decoder. As shown in FIG. 3, the detailed steps of training the decoder DecoderF (DCF for short) of the generative adversarial network in this embodiment include:
A1) randomly selecting a modality, obtaining an image n from it, extracting structural features (this embodiment uses the Sobel operator, denoted get_F in the figures) to obtain a structural feature map F, and extracting a mask (denoted get_Mask in the figures) to obtain the corresponding Mask;
A2) encoding the structural feature map F with the structural feature encoder EncoderF of the generative adversarial network to obtain a mean matrix CodeF,mean and a variance matrix CodeF,logvar, which can be expressed as CodeF,mean, CodeF,logvar = EncoderF(F); obtaining random noise Codee from the normal distribution N(0, 1²), which can be expressed as N(0, 1²) → Codee; and synthesizing the three codes CodeF,mean, CodeF,logvar and Codee into an approximately normally distributed matrix CodeF with added noise. As the number of training iterations increases, CodeF gets closer to the standard normal distribution.
A3) decoding the approximately normally distributed matrix CodeF with the mask decoder DecoderMask to obtain a reconstructed mask Maskr, which can be expressed as Maskr = DecoderMask(CodeF); decoding CodeF with the structural feature decoder DecoderF to obtain a reconstructed structural feature map Fr, which can be expressed as Fr = DecoderF(CodeF);
A4) randomly generating a matrix CodeF,RM conforming to the normal distribution N(0, 1²) and decoding it with the structural feature decoder DecoderF (DCF for short) to obtain a generated random structural feature map FRM, which can be expressed as FRM = DecoderF(CodeF,RM); decoding CodeF,RM with the mask decoder DecoderMask (DCMask for short) to obtain a generated random mask MaskRM, which can be expressed as MaskRM = DecoderMask(CodeF,RM);
A5) discriminating (F, Mask) and (FRM, MaskRM) with the structural feature map discriminator DiscriminatorF (DF for short), which discriminates (F, Mask) as real and (FRM, MaskRM) as fake, where F is the structural feature map, Mask is the mask, FRM is the random structural feature map and MaskRM is the random mask; discriminating CodeF and CodeF,RM with the structural feature discriminator FeatureDiscriminatorF (FDF for short), where CodeF is the approximately normally distributed matrix with added noise and CodeF,RM is the randomly generated matrix conforming to the normal distribution N(0, 1²);
A6) calculating losses from the output of each step and the corresponding loss functions, invoking the optimizer to differentiate the loss functions and obtain the gradients of the model parameters in each component, and then subtracting the corresponding gradients from the parameters to complete the update of the network parameters;
A7) judging whether a preset iteration termination condition is met, the termination condition being that the loss function value falls below a set threshold or the number of iterations reaches a set number of steps; if not, jumping back to step A1); otherwise, exiting.
In the training process of steps A1)-A7), the structural feature map decoded from the random normally distributed matrix is expected to be more realistic; therefore, the structural features and masks extracted from the original images and those decoded from random normally distributed matrices are adversarially learned through the structural feature map discriminator DiscriminatorF. In addition, a generative adversarial network over the coding features is added: the coding result CodeF of the real structural feature map and the matrix CodeF,RM obeying the standard normal distribution are adversarially learned through the structural feature discriminator FeatureDiscriminatorF, so that the encoder EncoderF learns to encode the structural feature map F into a result obeying the standard normal distribution.
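A single training iteration of steps A1)-A6) could be sketched as follows. This is an illustrative reconstruction only: it assumes binary cross-entropy adversarial terms and L1 reconstruction terms where the patent uses the weighted losses (1)-(4) detailed below, and it assumes each discriminator returns a single probability.

```python
import torch
import torch.nn.functional as F_nn

def decoder_f_training_step(encoder_f, decoder_f, decoder_mask,
                            disc_f, feat_disc_f, f, mask,
                            opt_gen, opt_disc):
    """One iteration of A2)-A6): VAE-style encode/decode of the structural
    feature map F plus the two adversarial games of step A5)."""
    # A2) Encode F into CodeF,mean / CodeF,logvar and reparameterize.
    code_mean, code_logvar = encoder_f(f)
    code_f = code_mean + torch.exp(0.5 * code_logvar) * torch.randn_like(code_mean)
    # A3) Reconstruct the structural feature map Fr and the mask Maskr.
    f_r, mask_r = decoder_f(code_f), decoder_mask(code_f)
    # A4) Decode a random normal matrix CodeF,RM into FRM and MaskRM.
    code_f_rm = torch.randn_like(code_f)
    f_rm, mask_rm = decoder_f(code_f_rm), decoder_mask(code_f_rm)
    # A5)/A6) Discriminator update: (F, Mask) real, (FRM, MaskRM) fake;
    # CodeF,RM real, CodeF fake.
    real, fake = torch.ones(1), torch.zeros(1)
    d_loss = (F_nn.binary_cross_entropy(disc_f(f, mask), real)
              + F_nn.binary_cross_entropy(disc_f(f_rm.detach(), mask_rm.detach()), fake)
              + F_nn.binary_cross_entropy(feat_disc_f(code_f_rm), real)
              + F_nn.binary_cross_entropy(feat_disc_f(code_f.detach()), fake))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()
    # A6) Generator update: reconstruction plus the two adversarial terms.
    g_loss = (F_nn.l1_loss(f_r, f) + F_nn.l1_loss(mask_r, mask)
              + F_nn.binary_cross_entropy(disc_f(f_rm, mask_rm), real)
              + F_nn.binary_cross_entropy(feat_disc_f(code_f), real))
    opt_gen.zero_grad(); g_loss.backward(); opt_gen.step()
```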
In this embodiment, the loss functions referred to in step A6) comprise the following four losses (1) to (4); ωi denotes the weight of each loss term (likewise below), and 0 and 1 denote fake and real respectively.
(1) Adversarial loss making the structural feature map code obey the normal distribution: lossFeatureDiscriminator,1 denotes the classification training loss of the feature discriminator, FeatureDiscriminatorF(CodeF,RM) denotes the discrimination output of the feature discriminator with CodeF,RM as input, FeatureDiscriminatorF(CodeF) denotes the discrimination output of the feature discriminator with CodeF as input, and lossGenerator,1 denotes the adversarial loss provided by the feature discriminator to the structural feature map encoder.
(2) Self-supervised loss making the intermediate coding result of the structural feature map obey the normal distribution: lossSupervision,1 denotes the supervised loss constraining CodeF to obey the normal distribution, mean(CodeF,mean) denotes the average of CodeF,mean, and mean(CodeF,logvar) denotes the average of CodeF,logvar.
(3) Adversarial loss making the structural feature map decoded from the random normally distributed matrix more realistic: lossDiscriminator,1 denotes the classification training loss of the structural feature map discriminator, DiscriminatorF(F, Mask) denotes the real/fake discrimination output of the structural feature map discriminator with F and Mask as input, DiscriminatorF(FRM, MaskRM) denotes the real/fake discrimination output with FRM and MaskRM as input, and lossGenerator,2 denotes the adversarial loss provided by the structural feature map discriminator to the structural feature map and mask generation components.
(4) Pairwise self-supervised consistency loss between the fused structural feature map and mask and the original structural feature map and mask: lossSupervision,2 denotes the reconstruction self-supervision loss of the structural feature map and the mask together with their consistency loss, F denotes the input structural feature map, Fr the reconstructed structural feature map, Mask the input mask, Maskr the reconstructed mask, FRM the structural feature map decoded from a randomly generated normally distributed matrix, and MaskRM the mask decoded from a randomly generated normally distributed matrix.
The total loss of the generation components is therefore:
lossGenerator,F = lossGenerator,1 + lossSupervision,1 + lossGenerator,2 + lossSupervision,2
where lossGenerator,F is the total loss of the generation components, lossGenerator,1 is the adversarial loss provided by the feature discriminator to the structural feature map encoder, lossSupervision,1 denotes the supervised loss constraining CodeF to obey the normal distribution, lossGenerator,2 denotes the adversarial loss provided by the structural feature map discriminator to the structural feature map and mask generation components, and lossSupervision,2 denotes the reconstruction self-supervision loss of the structural feature map and the mask together with their consistency loss.
The total loss of the discriminator components is:
lossDiscriminator,F = lossFeatureDiscriminator,1 + lossDiscriminator,1
where lossDiscriminator,F is the total loss of the discriminator components, lossFeatureDiscriminator,1 denotes the classification training loss of the feature discriminator, and lossDiscriminator,1 denotes the classification training loss of the structural feature map discriminator. The discriminator components and the generation components are each updated separately. In this embodiment, the loss functions of step A6) may also be replaced by other loss functions as needed.
It is well known that medical images generated directly from random noise through adversarial training are often difficult to train for and rarely carry true structural information. In this embodiment, an image providing the basic contour structure information of a medical image is called its structural feature map; for example, the retinal blood vessel distribution map can be regarded as the structural feature map of a retinal image. A structural feature map can provide the necessary guiding information for the synthesis of medical images; for example, some studies obtain basic structural information from brain segmentation label maps when synthesizing brain MRI images. However, conventional structural feature maps such as the retinal vessel distribution map and the brain segmentation label map require additional data and training before the structural feature map can be extracted from the original image. Therefore, this embodiment designs a method of extracting the structural feature map directly, which has the advantages of fast computation, no training and no additional data. In traditional digital image processing, the Roberts, Prewitt and Sobel operators are all excellent edge detection operators. The Prewitt and Sobel operators both use 3x3 templates, whose partial-derivative approximations are more accurate than those obtained with the 2x2 templates of the Roberts operator. Compared with the Prewitt operator, the Sobel operator weights the influence of pixel position, so it suppresses noise better and reduces edge blurring, giving a better result. The Sobel operator is commonly used for processing medical images, and outputs two gray-level images after processing.
The detailed steps of extracting the structural features in step a1) in this embodiment are as follows: (1) inputting a real image n, wherein beta is a set pixel threshold; (2) respectively carrying out bitwise maximum value reduce _ max () on two outputs of the sobel operator to obtain f2 and bitwise minimum value reduce _ min () to obtain f 1; (3) difference values (mean (f1) -f1, f2-mean (f2) are obtained by respectively solving f1 and f2 and the mean pixel, and mean is a mean function), so that a low pixel edge feature map and a high pixel edge feature map are obtained; (4) and (3) performing binarization processing on the low pixel edge feature map and the high pixel edge feature map by using a pixel threshold value beta, and finally adding the two binary maps and performing binarization with a threshold value of 0 to obtain a clear structural feature map f. In this embodiment, the method for extracting the structural feature in step a1) is expressed as F ═ get _ F (n) using a function.
The detailed steps of mask extraction in step A1) in this embodiment are as follows: (1) first binarize the image n with a pixel threshold of 0 to obtain a standard mask; (2) according to the size of the input image, enlarge the mask by p pixels beyond the original image size using nearest-neighbour interpolation, smoothing and magnifying the whole map; (3) crop p pixels from the outermost length and width so that the final output mask keeps the same size as the original input image. In this embodiment, the mask extraction method of step A1) is expressed as the function Mask = get_Mask(n).
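A corresponding sketch of get_Mask, with the same caveats: the use of OpenCV resize for the nearest-neighbour interpolation and the default value of p are assumptions.

```python
import numpy as np
import cv2

def get_Mask(n, p=4):
    """Mask extraction of step A1): binarize at 0, grow the mask by p pixels
    with nearest-neighbour interpolation, then crop back to the input size."""
    h, w = n.shape
    mask = (n > 0).astype(np.uint8)
    # (2) Enlarge by p pixels per side with nearest-neighbour interpolation.
    big = cv2.resize(mask, (w + 2 * p, h + 2 * p),
                     interpolation=cv2.INTER_NEAREST)
    # (3) Crop the outermost p pixels so the output matches the input size.
    return big[p:p + h, p:p + w].astype(np.float32)
```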
In this embodiment, the functional expression of the approximately normally distributed matrix CodeF with added noise obtained by synthesis in step A2) is:
CodeF = CodeF,mean + exp(0.5 * CodeF,logvar) * Codee;
where CodeF,mean is the mean matrix obtained by encoding the structural feature map F with the structural feature encoder EncoderF, CodeF,logvar is the variance matrix obtained by encoding the structural feature map F with the structural feature encoder EncoderF, and Codee is the random noise obtained from the normal distribution N(0, 1²).
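This is the standard VAE reparameterization trick; in code (a sketch, with tensor names taken from the description):

```python
import torch

def reparameterize(code_mean, code_logvar):
    # Codee ~ N(0, 1^2); CodeF = CodeF,mean + exp(0.5 * CodeF,logvar) * Codee
    code_e = torch.randn_like(code_mean)
    return code_mean + torch.exp(0.5 * code_logvar) * code_e
```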
In this embodiment, the detailed steps of step 3) include:
3.1) randomly selecting a lesion segmentation label map L and converting the label map, which contains 5 categories, into a one-hot matrix, obtaining a multi-dimensional label matrix whose number of channels equals the number of categories; in each channel only part of the pixels are valid and the rest are filled with 0, and the non-zero pixel regions are registered with the corresponding segmentation regions of the lesion segmentation label map;
3.2) stacking the multi-dimensional label matrix and the structural feature map FRM together along the channel dimension to obtain a matrix fusing the two input sources as the fusion result.
In this embodiment, step 4) is further preceded by a step of training the i-modality decoder Decoderi and i-modality encoder Encoderi of each modality i and the lesion segmentation label decoder DecoderL, and of training the random encoder EncoderRM, the detailed steps comprising:
B1) training the i-modality decoder Decoderi, the i-modality encoder Encoderi and the lesion segmentation label decoder DecoderL of each modality i, and training the random encoder EncoderRM;
B2) calculating losses from the output of each training step and the corresponding loss functions, invoking the optimizer to differentiate the loss functions and obtain the gradients of the model parameters in each component, and then subtracting the corresponding gradients from the parameters to complete the update of the network parameters;
B3) judging whether a preset iteration termination condition is met, the termination condition being that the loss function value falls below a set threshold or the number of iterations reaches a set number of steps; if not, jumping back to step B1); otherwise, exiting.
Referring to FIG. 4, the training of the i-modality decoder Decoderi, the i-modality encoder Encoderi and the lesion segmentation label decoder DecoderL of each modality i in step B1) serves as auxiliary training, and its detailed steps include:
Step 1, inputting an original image i of a random modality i;
Step 2, encoding the original image i with the i-modality encoder Encoderi to obtain a code Codei; in this embodiment, Codex = Encoderx(x) and Codey = Encodery(y);
Step 3, decoding the code Codei with the i-modality decoder Decoderi to obtain a reconstructed image ir, in this embodiment xr = Decoderx(Codex) and yr = Decodery(Codey); decoding the code Codei with the lesion segmentation label decoder DecoderL to obtain the lesion segmentation label map Li,f, here Lx,f = DecoderL(Codex) and Ly,f = DecoderL(Codey); meanwhile, for every other modality j: decoding the code Codei with the j-modality decoder Decoderj to obtain the j-modality conversion map jt of the original image i, here yt = Decodery(Codex) and xt = Decoderx(Codey); encoding the conversion map jt with the j-modality encoder Encoderj to obtain the code Codej,t, in this embodiment Codex,t = Encoderx(xt) and Codey,t = Encodery(yt); and decoding the code Codej,t with the i-modality decoder Decoderi to obtain the cyclic reconstruction map ic of the original image i with modality j as the intermediate modality, in this embodiment xc = Decoderx(Codey,t) and yc = Decodery(Codex,t);
Step 4, discriminating, with the modality discriminator Discriminatori, the original image i and each i-modality conversion map it converted from a modality j to modality i, discriminating the former as real and the latter as fake.
As shown in FIG. 4, in the specific training process of generating X, Y registered modality images from a random input meeting the specification, the detailed steps of training the x-modality decoder Decoderx, the x-modality encoder Encoderx and the lesion segmentation label decoder DecoderL for modality x in this embodiment include:
S1, inputting a random x-modality image x;
S2, encoding the original image x with the x-modality encoder Encoderx to obtain Codex, and decoding Codex with the decoder Decoderx to obtain the reconstructed image xr;
S3, decoding Codex with the lesion segmentation label decoder DecoderL to obtain the segmentation label map Lx,f;
S4, decoding Codex with the y-modality decoder Decodery to obtain the conversion map yt;
S5, encoding the conversion map yt with the y-modality encoder Encodery to obtain Codey,t, and decoding Codey,t with the x-modality decoder Decoderx to obtain the cyclic conversion map xc;
S6, the modality discriminator Discriminatorx discriminates x and xt separately, discriminating the former as real and the latter as fake.
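In code, the forward passes of S1-S6 for modality x chain the trained modules as follows. This is a sketch only: the module handles mirror the names above, and the discriminator and loss computations are omitted.

```python
def auxiliary_forward_x(encoder_x, decoder_x, encoder_y, decoder_y,
                        decoder_l, x):
    code_x = encoder_x(x)         # S2: Codex = Encoderx(x)
    x_r = decoder_x(code_x)       # S2: reconstructed image xr
    l_x_f = decoder_l(code_x)     # S3: segmentation label map Lx,f
    y_t = decoder_y(code_x)       # S4: conversion map yt
    code_y_t = encoder_y(y_t)     # S5: Codey,t = Encodery(yt)
    x_c = decoder_x(code_y_t)     # S5: cyclic conversion map xc
    # S6: Discriminatorx later sees x as real and xt as fake.
    return x_r, l_x_f, y_t, x_c
```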
The training process of modality y is the same as that of modality x. The main purpose of this synchronous auxiliary training is to help train the Encoderx, Encodery, Decoderx, Decodery and DecoderL modules, so that the generation training of the random encoder EncoderRM in step B1) is easier to learn.
Referring to FIGS. 5 and 6, the detailed steps of training the random encoder EncoderRM in step B1) include:
Step 1, randomly selecting a modality, obtaining an image n and the corresponding lesion segmentation label map Ln from it, obtaining a structural feature map F1 with the structural feature extraction method, and obtaining the corresponding Mask with the mask extraction method; removing the lesion information of the structural feature map F1 with the lesion segmentation label map Ln to obtain a structural feature map F without lesion information;
Step 2, fusing the structural feature map F with a randomly input lesion segmentation label map L as random input to obtain the fusion result FRM,expand;
Step 3, feeding the fusion result FRM,expand into the random encoder EncoderRM (ECRM for short) to be encoded into a code CodeRM;
Step 4, inputting the code CodeRM into the lesion segmentation label decoder DecoderL (DCL for short) to decode a reconstructed lesion segmentation label map Lr; meanwhile, for each modality i: inputting the code CodeRM into the i-modality decoder Decoderi (DCi for short) to obtain the i-modality generated image ig, extracting structural features from the generated image ig to obtain the structural feature map Fi,g, inputting the generated image ig into the i-modality encoder Encoderi (ECi for short) to obtain the code Codei,g, decoding the code Codei,g with the lesion segmentation label decoder DecoderL (DCL for short) to obtain Li,g, and inputting Codei,g into the j-modality decoder Decoderj (DCj for short) of every other modality j to obtain the corresponding j-modality generated conversion map jg,t;
Step 5, for each modality i, discriminating, with the i-modality discriminator Discriminatori (Di for short), the original image of that modality and the i-modality generated image ig, discriminating the former as real and the latter as fake; discriminating, with the feature discriminator FeatureDiscriminator (FD for short), the code CodeRM and the code Codei of each modality i, discriminating CodeRM as fake and the Codei of each modality i as real.
As shown in FIGS. 5 and 6, in this embodiment the specific generation training process of producing X, Y registered modality images and segmentation labels from the input random structural feature map and random segmentation label map L is as follows:
S1, randomly selecting one modality at a time, obtaining an image n and its corresponding segmentation label map Ln from it, obtaining a structural feature map F1 with the structural feature extraction method, denoted F1 = get_F(n) in this embodiment, and obtaining the corresponding Mask with the mask extraction method, denoted Mask = get_Mask(n) in this embodiment;
S2, removing the lesion information of the structural feature map F1 with the segmentation label map Ln to obtain a structural feature map F without lesion information, denoted F = removeL(Ln, F1) in this embodiment, where removeL() is a simple function that binarizes the segmentation label into a mask and then multiplies the mask with the structural feature map to eliminate the lesion pixels (a sketch follows these steps);
S3, fusing the structural feature map F with the randomly input segmentation label map L to obtain FRM,expand, denoted FRM,expand = concat(onehot(L), F) in this embodiment, where concat() is the channel splicing function and onehot() is the aforementioned one-hot vector expansion method;
S4, feeding FRM,expand into the random encoder EncoderRM, denoted CodeRM = EncoderRM(FRM,expand) in this embodiment;
S5, feeding CodeRM into the X-modality decoder Decoderx, the Y-modality decoder Decodery and the segmentation label map decoder DecoderL respectively, decoding the X-modality generated map xg, the Y-modality generated map yg and the reconstructed segmentation label map Lr, expressed as xg = Decoderx(CodeRM), yg = Decodery(CodeRM) and Lr = DecoderL(CodeRM);
S6, extracting the feature maps Fx,g and Fy,g from the X-modality generated map xg and the Y-modality generated map yg, where Fx,g = get_F(xg), Fy,g = get_F(yg), and get_F() is the structural feature map extraction method based on the Sobel operator;
S7, encoding the X-modality generated map xg with the X-modality encoder Encoderx to obtain Codex,g, expressed in this embodiment as Codex,g = Encoderx(xg), and likewise Codey,g = Encodery(yg);
S8, decoding Codex,g with the decoder DecoderL to obtain Lx,g, denoted Lx,g = DecoderL(Codex,g) in this embodiment, and likewise Ly,g = DecoderL(Codey,g);
S9, decoding Codex,g with the decoder Decodery to obtain yg,t, expressed in this embodiment as yg,t = Decodery(Codex,g), and likewise xg,t = Decoderx(Codey,g);
S10, similarly performing the same operations on the Y modality: encoding the generated map yg with the encoder Encodery to obtain Codey,g;
S11, feeding Codey,g into the decoder DecoderL to decode Ly,g;
S12, feeding Codey,g into the decoder Decoderx to obtain xg,t;
S13, the modality X discriminator discriminates x and xg separately, discriminating the former as real and the latter as fake; the modality Y discriminator discriminates y and yg separately, discriminating the former as real and the latter as fake; the feature discriminator discriminates CodeRM, Codex and Codey separately, discriminating CodeRM as fake and Codex and Codey as real. In the above image generation training process, since the structural feature map F1 is extracted from the random image n, the extracted structural features may contain lesion structure information, which would interfere with the lesion information in the random label L and affect the image generated after fusion; therefore F1 must have its lesion information eliminated before fusion with the random label L, yielding the lesion-free structural feature map F, so that the lesion information of the generated image derives from the label L only.
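As referenced in S2 above, removeL() only needs a binarization and an element-wise product. A minimal sketch, assuming nonzero label values mark lesion pixels:

```python
import numpy as np

def removeL(ln, f1):
    """removeL() of step S2: binarize the segmentation label map Ln into a
    mask that is 0 on lesion pixels (assumed to be the nonzero labels) and 1
    elsewhere, then multiply element-wise to erase lesion structure from F1."""
    non_lesion = (ln == 0).astype(f1.dtype)
    return f1 * non_lesion
```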
In this embodiment, the loss functions of the multi-modality map generation training and the auxiliary training process comprise the following losses (5) to (15):
(5) Adversarial loss making the X- and Y-modality maps generated from the random structural feature map more realistic: lossDiscriminator,2 denotes the x-modality classification training loss of the modality discriminator, Discriminatorx(x) denotes the output of the modality discriminator with x as input, and lossGenerator,3 denotes the x-modality adversarial loss provided to the generation components by the modality discriminator; lossDiscriminator,3, Discriminatory(y) and lossGenerator,4 denote the corresponding quantities of the y modality.
(6) Adversarial loss making the coding result of the random structural feature map closer to the coding result of a real modality map, so as to reduce the decoding difficulty of the decoders and ensure they can successfully decode the modality maps: lossFeatureDiscriminator,2 denotes the classification training loss of the feature discriminator, lossGenerator,5 denotes the adversarial loss provided by the feature discriminator to the encoder to direct it to encode the input fusion map into a coding result resembling that of a real modality map, and FeatureDiscriminator(CodeRM) denotes the real/fake discrimination output of the feature discriminator with CodeRM as input.
(7) Self-supervised loss for reconstructing the input structural feature map: lossSupervision,3 denotes the reconstruction self-supervision loss of the structural feature map, removeL(L, F) denotes the lesion-free structural feature map obtained by applying removeL() to L and F, and removeL(L, Fx,g) the one obtained by applying removeL() to L and Fx,g, where removeL() is the simple function that binarizes the segmentation label into a mask and multiplies it with the structural feature map to eliminate the lesion pixels.
(8) Reconstruction loss of the input lesion segmentation label map after fusion with the input structural feature map: lossSupervision,4 denotes the reconstruction self-supervision loss of the lesion segmentation label map, L is the input label, Lx,g is the generated label of the x modality, and Lr is the directly reconstructed label.
(9) Supervised loss of the X- and Y-modality map segmentation training: lossSupervision,5 is this loss, Lx is the real label of the x modality, and Lx,f is the generated label of the x modality.
(10) Self-supervised loss between a generated map and the generated conversion map obtained by converting the generated map of the other modality: lossSupervision,6 is this loss; xg is the generated X-modality MRI, and xg,t is the generated Y-modality MRI converted into X-modality MRI. The registration constraint on the generated multi-modality MRI is realized through this loss; when extended to more modalities, the mean squared error loss is computed between each modality's generated map and the conversion maps of that modality obtained from the generated maps of the other modalities.
(11) Supervised loss limiting the pixel generation range to the organ body mask: lossSupervision,7 denotes this loss; Mask is the binary mask map, in which the organ region described by the mask takes the value 1 and the region outside it the value 0.
(12) Self-supervised loss between the original images and the reconstructed images obtained by reconstructing the X- and Y-modality maps: lossSupervision,8 denotes this loss.
(13) Supervised loss between the cyclic conversion maps obtained by converting the X- and Y-modality maps and the original maps: lossSupervision,c denotes this loss; xc is the cyclic reconstruction map obtained by converting the real X-modality MRI into Y-modality MRI and back into X-modality MRI.
(14) Self-supervised semantic consistency loss between the codes from which the decoders generate the X- and Y-modality maps and the codes obtained by re-encoding the generated maps with the encoders: lossSupervision,9 denotes this loss; Codex,g is the coding result obtained by re-encoding the generated map of the X modality.
(15) Self-supervised semantic consistency loss of the real X- and Y-modality map codings: lossSupervision,10 denotes this loss; Codex denotes the coding result of the real X-modality MRI, and Codey,t denotes the re-coding result after the real X-modality MRI is converted into Y-modality MRI. When extended to more modalities, the mean squared error loss is computed between the original image coding result and the re-coding results via the different intermediate modalities.
The generation components in the multi-modality map generation training and the auxiliary conversion training are shared, and they are trained and updated synchronously.
In the multi-modality map generation training and auxiliary training process of this embodiment, the total loss of the discriminator components is:
lossDiscriminator = Σi≥2 lossDiscriminator,i + Σi≥2 lossFeatureDiscriminator,i
where lossDiscriminator denotes the total loss of the discriminator components, Σi≥2 lossDiscriminator,i denotes the total loss of the modality discriminators, and Σi≥2 lossFeatureDiscriminator,i denotes the total loss of the feature discriminator.
The total loss of the generation components is:
lossGenerator = Σi≥3 lossGenerator,i + Σi≥3 lossSupervision,i
where lossGenerator denotes the total loss of the generation components, Σi≥3 lossGenerator,i denotes the total adversarial loss provided by the discriminators, and Σi≥3 lossSupervision,i denotes the lesion label supervised loss together with each of the self-supervision loss terms.
The above loss functions are not written out for additional modes; loss terms for the corresponding modes can be added directly according to the number of generated and converted modes, which is not limited to the two modes x and y of this embodiment. For training a plurality of modes, it is only necessary to supplement the mode conversion training following the registered-image generation training method.
When training n (n > 1) modes, the auxiliary training must be performed for all n modes; each mode involves n-1 rounds of steps S4-S5, corresponding to the cyclic conversion training between the current mode and each of the other n-1 modes. In generation training step S5, label maps and generated maps of the n modes are produced; each mode involves n-1 rounds of step S9, corresponding to the conversion training between the current mode and each of the other n-1 modes, as the small sketch below illustrates.
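Purely for illustration, the ordered mode pairs that these per-mode rounds enumerate can be counted as follows:

# Each of the n modes is paired with the other n-1 modes, giving
# n*(n-1) ordered conversion / cycle-conversion training rounds.
n = 3
pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
assert len(pairs) == n * (n - 1)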
The generation countermeasure network of this embodiment has an encoder for receiving random input, a decoder for decoding the input segmentation label map from the encoding result, and a feature discriminator for discriminating whether the encoding result is true or false; in addition, each mode has its own encoder, decoder and mode discriminator. The embodiment further includes a feature extraction module for extracting structural features from a real image, an encoder for encoding the structural feature map into a normal distribution, a decoder for decoding the structural feature map from the normal distribution, a structural feature map discriminator for guiding the reconstruction of the structural feature map through true/false discrimination, and a structural feature discriminator for performing true/false discrimination on the encoding result.
After the training of steps A1)-A7) and steps B1)-B3) is completed, only part of the module components therein need to be recombined, after which a large number of multi-modal registered images can be generated conveniently and quickly: a random matrix CodeF,RM is obtained from the normal distribution N(0,1²) and decoded by the trained structural feature decoder DecoderF to generate a structural feature map FRM; the FRM is fused with a randomly selected segmentation label map L; the fusion result is input into the trained random input encoder EncoderRM to obtain the code CodeRM; and finally the CodeRM is decoded by the trained decoders Decoderx and Decodery of the different modes to generate a registered X-mode map xg and a registered Y-mode map yg, as sketched below.
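A minimal sketch of this generation pipeline, assuming the trained modules behave as callables on tensors; the module signatures and tensor shapes here are placeholders, not the patent's architecture:

import torch

def generate_registered_pair(decoder_f, encoder_rm, decoder_x, decoder_y, label_map):
    code_f_rm = torch.randn(1, 256, 8, 8)        # random matrix CodeF,RM drawn from N(0, 1)
    f_rm = decoder_f(code_f_rm)                  # structural feature map FRM
    fused = torch.cat([f_rm, label_map], dim=1)  # fuse FRM with the segmentation label map L
    code_rm = encoder_rm(fused)                  # code CodeRM
    x_g = decoder_x(code_rm)                     # registered X-mode map xg
    y_g = decoder_y(code_rm)                     # registered Y-mode map yg
    return x_g, y_g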
Compared with the prior art, the method of this embodiment for generating registered multi-modality MRI with lesion segmentation tags has the following advantages: 1) the data used for training need not be registered; unsupervised learning is adopted, multi-mode registered MRI images can be generated, the generated data carry labels, and the number of modes is not limited. 2) The modular design makes mode expansion convenient and model training more flexible: modules can be trained independently or synchronously, and the trained modules are combined and reused at generation time. 3) The structural feature extraction method is improved over the traditional Sobel-operator method: features are first extracted with the Sobel operator and then further processed to obtain a maximum-value map and a minimum-value map, which are finally fused into the structural feature map, so that sufficient structural information is retained (see the sketch below).
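A hedged sketch of such a Sobel-then-fuse extraction; the patent does not spell out the post-processing, so the element-wise max/min over the two directional responses and the averaging fusion are assumptions:

import numpy as np
from scipy import ndimage

def get_f(image):
    # Directional Sobel responses.
    gx = ndimage.sobel(image, axis=0)
    gy = ndimage.sobel(image, axis=1)
    # Assumed post-processing: element-wise maximum and minimum maps.
    max_map = np.maximum(gx, gy)
    min_map = np.minimum(gx, gy)
    # Assumed fusion: simple average of the two maps.
    return 0.5 * (max_map + min_map)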
In each formula of this embodiment, x, y, xr, yr, xt, yt, xc, yc, xg, yg, xg,t, yg,t respectively correspond to the original images, reconstructed images, conversion maps, cyclic conversion maps, generated maps and generated conversion maps of the two modes X and Y; Encoderx, Encodery, Decoderx, Decodery respectively denote the encoder and decoder of mode X and of mode Y; and Codex, Codey, Codex,t, Codey,t, Codex,g, Codey,g denote the feature results obtained after x, y, xt, yt, xg, yg are respectively encoded by the corresponding encoders Encoderx, Encodery.
F, Fx,g, Fy,g denote the structural feature map extracted from an arbitrary mode map n by the structural feature extraction method get_F, the X-mode generated structural feature map, and the Y-mode generated structural feature map; Mask denotes the mask extracted by the mask extraction method get_Mask; DecoderF, DecoderMask, DecoderL and EncoderRM denote the structural feature decoder, the mask decoder, the segmentation label decoder and the random input encoder, respectively. CodeF,mean and CodeF,logvar denote the feature results obtained by encoding F with the encoder EncoderF; Codee and CodeF,RM denote matrices randomly obtained from the normal distribution N(0,1²); CodeF denotes the normal distribution matrix after adding noise. Fr and FRM denote the reconstructed feature map and the random feature map obtained by decoding CodeF and CodeF,RM with DecoderF; Maskr and MaskRM denote the reconstructed mask and the random mask obtained by decoding CodeF and CodeF,RM with DecoderMask. FRM,expand denotes the feature label map obtained by fusing FRM with the segmentation label map L, and CodeRM denotes the feature result obtained by encoding FRM,expand with EncoderRM. Lx,f, Lr, Lx,g, Ly,g denote the label maps obtained by decoding Codex, CodeRM, Codex,g and Codey,g with the label decoder DecoderL: the lesion segmentation label map of a real image, the reconstruction label map, the X-mode generated segmentation label map and the Y-mode generated segmentation label map.
In addition, DiscriminatorF mentioned in the above training method is the structural feature map discriminator, FeatureDiscriminatorF is the structural feature discriminator, Discriminatorx and Discriminatory denote the discriminators of mode X and mode Y, and FeatureDiscriminator is the feature discriminator shared by the plurality of modes.
Further, this embodiment also provides a system for generating registered multi-modality MRI with lesion segmentation tags, comprising:
a random matrix generation program unit for obtaining a random matrix CodeF,RM from the normal distribution N(0,1²);
a structural feature extraction program unit for inputting the random matrix CodeF,RM into the trained structural feature decoder DecoderF in the generation countermeasure network to decode and generate a structural feature map FRM;
a structural feature fusion program unit for performing random input fusion on the structural feature map FRM and a randomly selected lesion segmentation label map L to obtain a fusion result;
a random encoding program unit for inputting the fusion result into the trained random encoder EncoderRM in the generation countermeasure network to obtain the code CodeRM;
and a registered map generation program unit for inputting the code CodeRM into the trained i-modality decoder Decoderi of each modality i in the generation countermeasure network, respectively generating the registered i-modality MRI ig.
Furthermore, this embodiment also provides a system for generating registered multi-modality MRI with lesion segmentation tags, comprising a computer device programmed or configured to execute the steps of the aforementioned method, or a computer device whose storage medium has stored thereon a computer program programmed or configured to execute the aforementioned method.
Furthermore, this embodiment also provides a computer-readable storage medium having stored thereon a computer program programmed or configured to perform the aforementioned method of generating registered multi-modality MRI with lesion segmentation tags.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; all technical solutions under the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and refinements that do not depart from the principle of the present invention and that would occur to those skilled in the art are also considered to be within the protection scope of the present invention.

Claims (10)

1. A method of generating registered multi-modality MRI with lesion segmentation tags, characterized by the implementation steps comprising:
1) obtaining a random matrix CodeF,RM from the normal distribution N(0,1²);
2) inputting the random matrix CodeF,RM into the trained structural feature decoder DecoderF in the generation countermeasure network and decoding to generate a structural feature map FRM;
3) performing random input fusion on the structural feature map FRM and a randomly selected lesion segmentation label map L to obtain a fusion result;
4) inputting the fusion result into the trained random encoder EncoderRM in the generation countermeasure network to obtain the code CodeRM;
5) inputting the code CodeRM into the trained i-modality decoder Decoderi of each modality i in the generation countermeasure network, and respectively generating the registered i-modality MRI ig.
2. The method of generating registered lesion segmentation tagged multi-modality MRI of claim 1 further comprising the step of training a structure feature decoder DecoderF in a generation countermeasure network prior to step 2), the detailed steps comprising:
A1) randomly selecting a modality, obtaining a map n from that modality, extracting structural features to obtain a structural feature map F, and extracting a mask to obtain the corresponding Mask;
A2) encoding the structural feature map F with the structural feature encoder EncoderF in the generation countermeasure network to obtain an encoded mean matrix CodeF,mean and variance matrix CodeF,logvar, obtaining random noise Codee from the normal distribution N(0,1²), and synthesizing the mean matrix CodeF,mean, the variance matrix CodeF,logvar and the random noise Codee to obtain the noise-added normal distribution matrix CodeF;
A3) decoding the normal distribution matrix CodeF with the mask decoder DecoderMask to obtain a reconstructed mask Maskr; decoding CodeF with the structural feature decoder DecoderF to obtain a reconstructed structural feature map Fr;
A4) randomly generating a matrix CodeF,RM conforming to the normal distribution N(0,1²), decoding CodeF,RM with the structural feature decoder DecoderF to obtain a generated random structural feature map FRM, and decoding CodeF,RM with the mask decoder DecoderMask to obtain a generated random mask MaskRM;
A5) discriminating (F, Mask) and (FRM, MaskRM) respectively with the structural feature map discriminator DiscriminatorF, which identifies (F, Mask) as true and (FRM, MaskRM) as false, wherein F is the structural feature map, Mask is the mask, FRM is the random structural feature map, and MaskRM is the random mask; discriminating CodeF and CodeF,RM respectively with the structural feature discriminator FeatureDiscriminatorF, identifying CodeF as false and CodeF,RM as true, wherein CodeF is the noise-added normal distribution matrix and CodeF,RM is the randomly generated matrix conforming to the normal distribution N(0,1²);
A6) calculating the losses according to the output results of each step and the corresponding loss functions, calling an optimizer to differentiate the loss functions to obtain the gradients of the model parameters in each component, and then subtracting the corresponding gradient from each parameter to complete the update of the network parameters, as sketched after this claim;
A7) judging whether a preset iteration ending condition is met, the iteration ending condition being that the loss function value falls below a set threshold or the number of iterations reaches a set number of steps; if not, jumping to execute step A1); otherwise, exiting.
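Step A6) describes ordinary gradient descent; a minimal PyTorch sketch under that reading, where the optimizer choice, parameter shapes and stand-in loss are assumptions:

import torch

params = [torch.randn(3, 3, requires_grad=True)]  # stand-in model parameters
optimizer = torch.optim.SGD(params, lr=1e-4)      # assumed optimizer

loss = (params[0] ** 2).sum()                      # stand-in total loss
optimizer.zero_grad()
loss.backward()                                    # differentiate the loss to get gradients
optimizer.step()                                   # parameter := parameter - lr * gradient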
3. The method of generating registered lesion segmentation tagged multi-modality MRI as set forth in claim 2, wherein the functional expression by which step A2) synthesizes the noise-added normal distribution matrix CodeF is:
CodeF = CodeF,mean + exp(0.5 * CodeF,logvar) * Codee;
In the above formula, CodeF,mean is the mean matrix obtained by encoding the structural feature map F with the structural feature encoder EncoderF, CodeF,logvar is the variance matrix obtained by encoding the structural feature map F with the structural feature encoder EncoderF, and Codee is the random noise obtained from the normal distribution N(0,1²) (a sketch of this synthesis follows).
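This is the standard VAE reparameterization trick; a direct PyTorch rendering of the claim-3 formula, with tensor shapes left to the caller:

import torch

def reparameterize(code_f_mean, code_f_logvar):
    # CodeF = CodeF,mean + exp(0.5 * CodeF,logvar) * Codee, with Codee ~ N(0, 1).
    code_e = torch.randn_like(code_f_mean)  # random noise Codee
    return code_f_mean + torch.exp(0.5 * code_f_logvar) * code_e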
4. The method of generating registered lesion segmentation tagged multi-modality MRI as set forth in claim 1, wherein the detailed steps of step 3) include:
3.1) randomly selecting a lesion segmentation label map L, and converting the lesion segmentation label map L containing a plurality of categories into a one-hot matrix to obtain a multi-dimensional label matrix whose number of channels equals the number of categories, wherein only part of the pixels in each channel are valid, the rest are filled with 0, and the non-zero pixel regions are registered with the corresponding segmentation regions in the lesion segmentation label map;
3.2) stacking the multi-dimensional label matrix and the structural feature map FRM together along the channel dimension to obtain a matrix fusing the two input sources as the fusion result (see the sketch below).
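A minimal sketch of this one-hot-and-stack fusion with toy shapes; all shapes and the channel count are illustrative assumptions:

import torch
import torch.nn.functional as F

def fuse_label_and_features(label_map, f_rm, num_classes):
    # One channel per category; only that category's pixels are non-zero.
    one_hot = F.one_hot(label_map.long(), num_classes)  # (H, W, C)
    one_hot = one_hot.permute(2, 0, 1).float()          # (C, H, W)
    # Stack the label channels and the structural feature map along the channel dim.
    return torch.cat([one_hot, f_rm], dim=0)            # (C + 1, H, W)

L = torch.randint(0, 4, (64, 64))   # lesion segmentation label map with 4 categories
FRM = torch.rand(1, 64, 64)         # single-channel structural feature map
fused = fuse_label_and_features(L, FRM, num_classes=4)  # shape (5, 64, 64)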
5. The method of generating registered lesion segmentation tagged multi-modality MRI of claim 1, further comprising, before step 4), the step of training the i-modality decoder Decoderi, the i-modality encoder Encoderi and the lesion segmentation label decoder DecoderL of each modality i, and training the random encoder EncoderRM, the detailed steps comprising:
B1) training the i-modality decoder Decoderi, the i-modality encoder Encoderi and the lesion segmentation label decoder DecoderL of each modality i, and training the random encoder EncoderRM;
B2) calculating the losses according to the output results of each training step and the corresponding loss functions, calling an optimizer to differentiate the loss functions to obtain the gradients of the model parameters in each component, and then subtracting the corresponding gradient from each parameter to complete the update of the network parameters;
B3) judging whether a preset iteration ending condition is met, the iteration ending condition being that the loss function value falls below a set threshold or the number of iterations reaches a set number of steps; if not, jumping to execute step B1); otherwise, exiting.
6. The method of generating registered lesion segmentation tagged multi-modality MRI of claim 5 wherein the detailed steps of training the i-modality decoder Decoderi, the i-modality encoder Encoderi, and the lesion segmentation tag decoder DecoderL of each modality i in step B1) include:
step 1, inputting an original image i of a random modality i;
step 2, encoding the original image i with the i-modality encoder Encoderi to obtain a code Codei;
step 3, decoding the code Codei with the i-modality decoder Decoderi to obtain a reconstructed image ir; decoding the code Codei with the lesion segmentation label decoder DecoderL to obtain a lesion segmentation label map Li,f; meanwhile, for every other modality j, first decoding the code Codei with the j-modality decoder Decoderj to obtain the j-modality conversion map jt of the original image i, then encoding the conversion map jt with the j-modality encoder Encoderj to obtain the code Codej,t, and then decoding the code Codej,t with the i-modality decoder Decoderi to obtain the cyclic reconstruction map ic of the original image i with modality j as the intermediate modality;
step 4, discriminating, with the modality discriminator Discriminatori, the original image i and each i-modality conversion map it converted from a modality j to modality i, identifying the former as true and the latter as false.
7. The method of generating registered lesion segmentation tagged multi-modality MRI of claim 5, wherein the detailed steps of training the random encoder EncoderRM in step B1) comprise:
step 1, randomly selecting a modality, obtaining a map n and the corresponding lesion segmentation label map Ln from that modality, obtaining a structural feature map F1 by the structural feature extraction method, and obtaining the corresponding Mask by the mask extraction method; removing the lesion information from the structural feature map F1 using the lesion segmentation label map Ln to obtain a structural feature map F without lesion information;
step 2, performing random input fusion on the structural feature map F and a randomly input lesion segmentation label map L to obtain the fusion result FRM,expand;
step 3, feeding the fusion result FRM,expand into the random encoder EncoderRM to be encoded into the code CodeRM;
step 4, inputting the code CodeRM into the lesion segmentation label decoder DecoderL to decode a reconstructed lesion segmentation label map Lr; meanwhile, for each modality i: inputting the code CodeRM into the i-modality decoder Decoderi to obtain the i-modality generated map ig, extracting structural features from the i-modality generated map ig to obtain a structural feature map Fi,g, inputting the i-modality generated map ig into the i-modality encoder Encoderi to obtain the code Codei,g, decoding the code Codei,g with the lesion segmentation label decoder DecoderL to obtain Li,g, and inputting the code Codei,g into the j-modality decoder Decoderj of each of the other modalities j to obtain the corresponding generated conversion map jg,t;
step 5, for each modality i, discriminating, with the modality discriminator Discriminatori of modality i, the original map n of that modality and the i-modality generated map ig, identifying the former as true and the latter as false; discriminating the CodeRM and the Codei of each modality i respectively with the feature discriminator FeatureDiscriminator, identifying the CodeRM as false and the Codei of each modality i as true (a sketch of this objective follows this claim).
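Read as a standard adversarial objective, the feature discriminator of step 5 scores real-image codes as true and the random-input code as false; a hedged sketch in which the discriminator module and code shapes are placeholders, not the patent's architecture:

import torch
import torch.nn.functional as F

def feature_discriminator_loss(feature_disc, code_rm, real_codes):
    # CodeRM should be scored as false (label 0).
    fake_logits = feature_disc(code_rm)
    loss = F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    # The Codei of each modality i should be scored as true (label 1).
    for code_i in real_codes:
        real_logits = feature_disc(code_i)
        loss = loss + F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
    return loss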
8. a system for generating registered multi-modality MRI with lesion segmentation tags, comprising:
a random matrix generation program unit for obtaining a random matrix CodeF,RM from the normal distribution N(0,1²);
a structural feature extraction program unit for inputting the random matrix CodeF,RM into the trained structural feature decoder DecoderF in the generation countermeasure network to decode and generate a structural feature map FRM;
a structural feature fusion program unit for performing random input fusion on the structural feature map FRM and a randomly selected lesion segmentation label map L to obtain a fusion result;
a random encoding program unit for inputting the fusion result into the trained random encoder EncoderRM in the generation countermeasure network to obtain the code CodeRM;
and a registered map generation program unit for inputting the code CodeRM into the trained i-modality decoder Decoderi of each modality i in the generation countermeasure network, respectively generating the registered i-modality MRI ig.
9. A system for generating registered lesion segmentation tagged multi-modality MRI comprising a computer device, characterized in that the computer device is programmed or configured to perform the steps of the method for generating registered lesion segmentation tagged multi-modality MRI of any one of claims 1 to 7, or a storage medium of the computer device has stored thereon a computer program programmed or configured to perform the method for generating registered lesion segmentation tagged multi-modality MRI of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the method of generating registered lesion segmentation tagged multi-modality MRI of any one of claims 1-7.