CN110689561A - Conversion method, system and medium of multi-modal MRI and multi-modal CT based on modular GAN


Info

Publication number: CN110689561A
Application number: CN201910880585.3A
Authority: CN (China)
Other languages: Chinese (zh)
Other versions: CN110689561B (en)
Prior art keywords: MRI, modality, image, lesion, modal
Legal status: Granted; currently Active
Inventors: 瞿毅力, 苏琬棋, 邓楚富, 王莹, 卢宇彤, 陈志广, 肖侬
Original assignee and current assignee: Sun Yat-sen University
Application filed by Sun Yat-sen University; priority to CN201910880585.3A
Publication of CN110689561A; application granted; publication of CN110689561B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/30: Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10072: Tomographic images
    • G06T 2207/10081: Computed x-ray tomography [CT]
    • G06T 2207/10088: Magnetic resonance imaging [MRI]
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The invention discloses a conversion method, system and medium for multi-modal MRI and multi-modal CT based on a modular GAN. The conversion method comprises selecting trained modules in a GAN network, according to the type of task to be executed, to perform CT-to-CT modality conversion, CT-to-MRI modality conversion, MRI-to-MRI modality conversion, MRI-to-CT modality conversion, CT-to-MRI lesion task conversion, or MRI-to-CT lesion task conversion. Considering that the sub-modalities within MRI and within CT are very similar to one another, while the MRI and CT modalities differ greatly from each other, the invention provides a conversion method employing a modular conditional GAN.

Description

Conversion method, system and medium of multi-modal MRI and multi-modal CT based on modular GAN
Technical Field
The invention relates to the field of medical image processing, and in particular to a conversion method, system and medium for multi-modal MRI and multi-modal CT based on a modular GAN, which can generate registered multi-modal MRI and CT images through a conditional generative adversarial network, given an MRI or CT image of one modality and a target modality.
Background
There are many modalities in medical imaging, such as magnetic resonance imaging (MRI), ultrasound, CT, and so forth. MRI can be subdivided into sub-modalities with different contrasts, such as T1, T2, T1w and T2w, and CT can likewise yield different sub-modality images depending on the irradiation dose. Compared with single-modality data, registered multi-modality image data can provide more information. However, acquiring registered multi-modality medical images is costly. Expanding a data set with image synthesis techniques, converting existing single-modality images into registered multi-modality images, therefore has wide application and profound significance. Some studies employ fully convolutional networks (FCNs) or generative adversarial networks (GANs) for the conversion of medical images. An FCN requires registered multi-modality datasets for supervised learning, which imposes significant limitations. A GAN can realize unsupervised learning and generally comprises a generator and a discriminator: the generator performs the modality conversion, and the discriminator provides the generator with an adversarial loss that guides the generated images to look more real. When a GAN is used for multi-modality medical image conversion, one approach is to train multiple GANs, each responsible for one conversion task; another is to adopt a conditional GAN, add direction information for the target modality to the input, and train a single generator to realize different conversion tasks through different condition directions.
Current conditional-GAN-based multi-domain conversion methods are only suitable for converting among the very similar sub-modalities inside MRI or inside CT. The approach of MRI-CT bimodal conversion using two GANs is expensive to extend to the multi-modal case. To date, there is no mature research on the conversion of registered multi-modal MRI and multi-modal CT.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: given that the sub-modalities within MRI and within CT are very similar, while the MRI and CT modalities differ greatly from each other, the invention provides a conversion method employing a modular conditional GAN.
In order to solve the technical problems, the invention adopts the technical scheme that:
A modular GAN-based multi-modality MRI and multi-modality CT conversion method comprises the following implementation steps:
1) judging the type of the task to be executed: if the task is CT-to-CT modality conversion, jump to step 2); if CT-to-MRI modality conversion, jump to step 3); if MRI-to-MRI modality conversion, jump to step 4); if MRI-to-CT modality conversion, jump to step 5); if CT-to-MRI lesion task conversion, jump to step 6); and if MRI-to-CT lesion task conversion, jump to step 7);
2) combining the CT modality encoder and the CT modality decoder in the trained GAN network to obtain a CT internal multi-modality converter, and converting the input CT image of any sub-modality into a generated CT image of the target sub-modality through the CT internal multi-modality converter; return;
3) combining the CT modality encoder and the MRI modality decoder in the trained GAN network to obtain a CT-MRI multi-modality converter, and converting the input CT image of any sub-modality into a generated MRI image of the target sub-modality through the CT-MRI multi-modality converter; return;
4) combining the MRI modality encoder and the MRI modality decoder in the trained GAN network to obtain an MRI internal multi-modality converter, and converting the input MRI image of any sub-modality into a generated MRI image of the target sub-modality through the MRI internal multi-modality converter; return;
5) combining the MRI modality encoder and the CT modality decoder in the trained GAN network to obtain an MRI-CT multi-modality converter, and converting the input MRI image of any sub-modality into a generated CT image of the target sub-modality through the MRI-CT multi-modality converter; return;
6) combining the CT modality encoder in the trained GAN network with the MRI lesion task decoder to obtain an MRI lesion task processor, and converting the input CT image of any sub-modality into an MRI lesion task result through the MRI lesion task processor; return;
7) combining the MRI modality encoder in the trained GAN network with the CT lesion task decoder to obtain a CT lesion task processor, and converting the input MRI image of any sub-modality into a CT lesion task result through the CT lesion task processor. The pairing of encoders and decoders across steps 2) to 7) is summarized in the sketch below.
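The pairing logic can be illustrated with a minimal sketch. The module names EC_x, EC_y, DC_x, DC_y, DC_l_x and DC_l_y follow the notation used later in the description, while the task keys and the PyTorch-style composition are illustrative assumptions, not part of the patent:

```python
import torch.nn as nn

# Hypothetical task keys; each maps to the (encoder, decoder) pair named in steps 2)-7).
MODULE_TABLE = {
    "ct2ct":       ("EC_x", "DC_x"),    # step 2: CT internal multi-modality converter
    "ct2mri":      ("EC_x", "DC_y"),    # step 3: CT-MRI multi-modality converter
    "mri2mri":     ("EC_y", "DC_y"),    # step 4: MRI internal multi-modality converter
    "mri2ct":      ("EC_y", "DC_x"),    # step 5: MRI-CT multi-modality converter
    "ct2mri_task": ("EC_x", "DC_l_y"),  # step 6: MRI lesion task processor
    "mri2ct_task": ("EC_y", "DC_l_x"),  # step 7: CT lesion task processor
}

def build_converter(task: str, modules: dict) -> nn.Module:
    """Combine one trained encoder and one trained decoder for the requested task.

    The one-hot condition stacking needed by steps 2)-5) is shown in the
    forward-pass sketch that follows; lesion task processors (steps 6-7) chain
    encoder and decoder directly.
    """
    enc_name, dec_name = MODULE_TABLE[task]
    return nn.Sequential(modules[enc_name], modules[dec_name])
```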
Optionally, the step of converting the input CT image of any sub-modality into a generated CT image of the target sub-modality through the CT internal multi-modality converter in step 2) comprises: encoding the CT image of any sub-modality through the CT modality encoder of the CT internal multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with the one-hot condition vector that selects the target sub-modality along the channel dimension, and finally converting through the CT modality decoder of the CT internal multi-modality converter to generate the CT image; the lesion label of the generated CT image is the label_x of the CT lesion task.
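As a minimal sketch of this encode, stack, decode pass (assuming PyTorch tensors of shape (N, C, H, W); the helper name and signature are illustrative, not the patent's API):

```python
import torch

def convert(encoder, decoder, image, target_idx, num_targets):
    """Encode an image, stack one-hot condition channels for the target
    sub-modality, and decode to the generated image of that sub-modality."""
    code = encoder(image)                              # semantic feature map (N, C, H, W)
    n, _, h, w = code.shape
    cond = torch.zeros(n, num_targets, h, w, device=code.device)
    cond[:, target_idx] = 1.0                          # one_hot(j), broadcast over space
    return decoder(torch.cat([code, cond], dim=1))     # stack along channel dimension
```

The same helper serves steps 2) to 5); only the encoder, decoder and number of target sub-modalities change.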
Optionally, the step of converting the input CT image of any sub-modality into a generated MRI image of the target sub-modality through the CT-MRI multi-modality converter in step 3) comprises: encoding the CT image of any sub-modality through the CT modality encoder of the CT-MRI multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with the one-hot condition vector that selects the target sub-modality along the channel dimension, and finally converting through the MRI modality decoder of the CT-MRI multi-modality converter to generate the MRI image; the lesion label of the generated MRI image is the label_x of the CT lesion task.
Optionally, the step of converting the input MRI image of any sub-modality into a generated MRI image of the target sub-modality through the MRI internal multi-modality converter in step 4) comprises: encoding the MRI image of any sub-modality through the MRI modality encoder of the MRI internal multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with the one-hot condition vector that selects the target sub-modality along the channel dimension, and finally converting through the MRI modality decoder of the MRI internal multi-modality converter to generate the MRI image; the lesion label of the generated MRI image is the label_y of the MRI lesion task.
Optionally, the step of converting the input MRI image of any sub-modality into a generated CT image of the target sub-modality through the MRI-CT multi-modality converter in step 5) comprises: encoding the MRI image of any sub-modality through the MRI modality encoder of the MRI-CT multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with the one-hot condition vector that selects the target sub-modality along the channel dimension, and finally converting through the CT modality decoder of the MRI-CT multi-modality converter to generate the CT image; the lesion label of the generated CT image is the label_y of the MRI lesion task.
Optionally, the step of converting the input CT image of any sub-modality into an MRI lesion task result through the MRI lesion task processor in step 6) comprises: encoding the CT image of any sub-modality through the CT modality encoder of the MRI lesion task processor to obtain a semantic feature map, and generating the MRI lesion task result through the MRI lesion task decoder of the MRI lesion task processor; the lesion label of the MRI lesion task is label_y.
Optionally, the step of converting the input MRI image of any sub-modality into a CT lesion task result through the CT lesion task processor in step 7) comprises: encoding the MRI image of any sub-modality through the MRI modality encoder of the CT lesion task processor to obtain a semantic feature map, and generating the CT lesion task result through the CT lesion task decoder of the CT lesion task processor; the lesion label of the CT lesion task is label_x.
Optionally, step 1) is preceded by a step of training the GAN network, and the detailed steps comprise:
S1) designing each component of the GAN network, wherein the components comprise an MRI modality encoder, a CT modality encoder, an MRI modality decoder, a CT modality decoder, an MRI lesion task decoder, a CT lesion task decoder, a modality discriminator and a feature discriminator;
S2) obtaining CT multi-modality training data with registered lesion labels label_x of the corresponding lesion processing task and MRI multi-modality training data with registered lesion labels label_y of the corresponding lesion processing task as training data; the modalities and sub-modalities of the training data need not be registered with one another;
S3) combining the MRI modality encoder and the CT lesion task decoder to obtain a CT lesion task processor, and performing its lesion task processing training based on the CT multi-modality training data with registered lesion labels label_x; combining the CT modality encoder and the MRI lesion task decoder to obtain an MRI lesion task processor, and performing its lesion task processing training based on the MRI multi-modality training data with registered lesion labels label_y; the CT modality encoder and the CT modality decoder form a CT internal multi-modality converter and perform CT-to-CT training based on the training data; the CT modality encoder and the MRI modality decoder form a CT-MRI multi-modality converter and perform CT-to-MRI training based on the training data; the MRI modality encoder and the MRI modality decoder form an MRI internal multi-modality converter and perform MRI-to-MRI training based on the training data; and the MRI modality encoder and the CT modality decoder form an MRI-CT multi-modality converter and perform MRI-to-CT training based on the training data;
S4) comparing the lesion labels label_x in the CT multi-modality training data respectively with the lesion labels label_x obtained from CT-to-CT training, from CT-to-MRI training, and from the CT lesion task processing training, and comparing the lesion labels label_y in the MRI multi-modality training data respectively with the lesion labels label_y obtained from MRI-to-MRI training, from MRI-to-CT training, and from the MRI lesion task processing training; if the comparison result of any training does not meet the requirement, jumping back to step S3) to continue training; otherwise ending and exiting. An outline of this procedure is sketched below.
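The outline below sketches steps S3) and S4) as a training driver, under the assumption that the per-combination training routines and the label-comparison evaluation are supplied as callables; all names are illustrative:

```python
def train_gan(modules, ct_data, mri_data, train_conversion,
              train_lesion_processing, evaluate, threshold):
    """Repeat the six S3) trainings until every S4) label comparison passes."""
    while True:
        # S3) lesion task processing training of the two cross processors
        train_lesion_processing(modules["EC_y"], modules["DC_l_x"], ct_data)   # CT lesion task processor
        train_lesion_processing(modules["EC_x"], modules["DC_l_y"], mri_data)  # MRI lesion task processor
        # S3) the four modality conversion trainings
        train_conversion(modules["EC_x"], modules["DC_x"], ct_data)            # CT to CT
        train_conversion(modules["EC_x"], modules["DC_y"], ct_data)            # CT to MRI
        train_conversion(modules["EC_y"], modules["DC_y"], mri_data)           # MRI to MRI
        train_conversion(modules["EC_y"], modules["DC_x"], mri_data)           # MRI to CT
        # S4) compare generated lesion labels against label_x and label_y
        if all(score >= threshold for score in evaluate(modules, ct_data, mri_data)):
            break
```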
In addition, the present invention also provides a modular GAN-based multi-modal MRI and multi-modal CT conversion system, which comprises a computer device programmed or configured to execute the steps of the above modular GAN-based multi-modal MRI and multi-modal CT conversion method, or whose storage medium stores a computer program programmed or configured to execute the method.
Furthermore, the present invention also provides a computer readable storage medium having stored thereon a computer program programmed or configured to execute the modular GAN-based multi-modality MRI and multi-modality CT conversion method.
Compared with the prior art, the invention has the following advantages: considering that the sub-modalities within MRI and within CT are very similar, while the MRI and CT modalities differ greatly from each other, the invention provides a conversion method employing a modular conditional GAN.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the principle of the combined use of the method according to the embodiment of the present invention.
Fig. 3 is a schematic diagram of a main process of GAN network training according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of a core process of module combination training according to an embodiment of the present invention.
Fig. 5 is a schematic diagram illustrating a principle of mutual transformation training of intra-modal submodels according to an embodiment of the present invention.
Fig. 6 is a schematic diagram illustrating the principle of CT image and MRI modality interactive training according to an embodiment of the present invention.
Fig. 7 is a schematic diagram illustrating a training principle of a segmentation detection network according to an embodiment of the present invention.
Fig. 8 is a schematic diagram of a GAN network authentication flow according to an embodiment of the present invention.
Fig. 9 is a schematic diagram illustrating a principle of using a GAN network according to an embodiment of the present invention.
Detailed Description
The conversion method, system and medium for modular GAN-based multi-modal MRI and multi-modal CT will be further described in detail below, taking as an example the conversion among four MRI modalities of the lung (T1, T2, T1c and Flair) and two CT modalities of the lung (the high-dose CT image and the PET-CT image). Of course, given this example, those skilled in the art can easily apply the conversion method, system and medium of the present invention to other body sites and to MRI and CT modalities with different lesion processing tasks.
As shown in fig. 1, the implementation steps of the conversion method of the modular GAN-based multi-modality MRI and multi-modality CT in the present embodiment include:
1) judging the type of the task to be executed: if the task is CT-to-CT modality conversion, jump to step 2); if CT-to-MRI modality conversion, jump to step 3); if MRI-to-MRI modality conversion, jump to step 4); if MRI-to-CT modality conversion, jump to step 5); if CT-to-MRI lesion task conversion, jump to step 6); and if MRI-to-CT lesion task conversion, jump to step 7);
2) combining the CT modality encoder and the CT modality decoder in the trained GAN network to obtain a CT internal multi-modality converter, and converting the input CT image of any sub-modality into a generated CT image of the target sub-modality through the CT internal multi-modality converter; return;
3) combining the CT modality encoder and the MRI modality decoder in the trained GAN network to obtain a CT-MRI multi-modality converter, and converting the input CT image of any sub-modality into a generated MRI image of the target sub-modality through the CT-MRI multi-modality converter; return;
4) combining the MRI modality encoder and the MRI modality decoder in the trained GAN network to obtain an MRI internal multi-modality converter, and converting the input MRI image of any sub-modality into a generated MRI image of the target sub-modality through the MRI internal multi-modality converter; return;
5) combining the MRI modality encoder and the CT modality decoder in the trained GAN network to obtain an MRI-CT multi-modality converter, and converting the input MRI image of any sub-modality into a generated CT image of the target sub-modality through the MRI-CT multi-modality converter; return;
6) combining the CT modality encoder in the trained GAN network with the MRI lesion task decoder to obtain an MRI lesion task processor, and converting the input CT image of any sub-modality into an MRI lesion task result through the MRI lesion task processor; return;
7) combining the MRI modality encoder in the trained GAN network with the CT lesion task decoder to obtain a CT lesion task processor, and converting the input MRI image of any sub-modality into a CT lesion task result through the CT lesion task processor.
As shown in fig. 2(a), the step of converting the input CT image of any sub-modality into a generated CT image of the target sub-modality through the CT internal multi-modality converter in step 2) comprises: the CT image x_i of any sub-modality i is encoded by the CT modality encoder EC_x of the CT internal multi-modality converter to obtain a semantic feature map code, which is stacked along the channel direction with the one-hot condition vector one_hot(j) that selects the target sub-modality j, and finally decoded by the CT modality decoder DC_x of the CT internal multi-modality converter to generate the CT image x_{t,i,j} of the target sub-modality; the lesion label of the generated CT image x_{t,i,j} is the label_x of the CT lesion task.
As shown in fig. 2(c), the step of converting the input CT image of any sub-modality into a generated MRI image of the target sub-modality through the CT-MRI multi-modality converter in step 3) comprises: the CT image x_i of any sub-modality i is encoded by the CT modality encoder EC_x of the CT-MRI multi-modality converter to obtain a semantic feature map code, which is stacked along the channel direction with the one-hot condition vector one_hot(j) that selects the target sub-modality j, and finally decoded by the MRI modality decoder DC_y of the CT-MRI multi-modality converter to generate the MRI image y_{t,x,i,j} of the target sub-modality; the lesion label of the generated MRI image y_{t,x,i,j} is the label_x of the CT lesion task.
As shown in fig. 2(b), the step of converting the input MRI image of any sub-modality into a generated MRI image of the target sub-modality through the MRI internal multi-modality converter in step 4) comprises: the MRI image y_i of any sub-modality i is encoded by the MRI modality encoder EC_y of the MRI internal multi-modality converter to obtain a semantic feature map code, which is stacked along the channel direction with the one-hot condition vector one_hot(j) that selects the target sub-modality j, and finally decoded by the MRI modality decoder DC_y of the MRI internal multi-modality converter to generate the MRI image y_{t,i,j} of the target sub-modality; the lesion label of the generated MRI image y_{t,i,j} is the label_y of the MRI lesion task.
As shown in fig. 2(d), the step of converting the input MRI image of any sub-modality into a generated CT image of the target sub-modality through the MRI-CT multi-modality converter in step 5) comprises: the MRI image y_i of any sub-modality i is encoded by the MRI modality encoder EC_y of the MRI-CT multi-modality converter to obtain a semantic feature map code, which is stacked along the channel direction with the one-hot condition vector one_hot(j) that selects the target sub-modality j, and finally decoded by the CT modality decoder DC_x of the MRI-CT multi-modality converter to generate the CT image x_{t,y,i,j} of the target sub-modality; the lesion label of the generated CT image x_{t,y,i,j} is the label_y of the MRI lesion task.
As shown in fig. 2(e), the step of converting the input CT image of any sub-modality into an MRI lesion task result through the MRI lesion task processor in step 6) comprises: the CT image x_i of any sub-modality is encoded by the CT modality encoder EC_x of the MRI lesion task processor to obtain a semantic feature map code, and the MRI lesion task decoder DC_{l,y} of the MRI lesion task processor then generates the MRI lesion task output label_{g,y,i}; the lesion label of the MRI lesion task output label_{g,y,i} is label_y.
As shown in fig. 2(f), the step of converting the input MRI image of any sub-modality into a CT lesion task result through the CT lesion task processor in step 7) comprises: the MRI image y_i of any sub-modality is encoded by the MRI modality encoder EC_y of the CT lesion task processor to obtain a semantic feature map code, and the CT lesion task decoder DC_{l,x} of the CT lesion task processor then generates the CT lesion task output label_{g,x,i}; the lesion label of the CT lesion task output label_{g,x,i} is label_x.
During training, the GAN network of this embodiment comprises an MRI modality encoder, a CT modality encoder, an MRI modality decoder, a CT modality decoder, an MRI lesion task decoder, a CT lesion task decoder, a modality discriminator, and a feature discriminator. For testing, an MRI lesion task processor formed by combining the MRI modality encoder and the MRI lesion task decoder, and a CT lesion task processor formed by combining the CT modality encoder and the CT lesion task decoder, additionally need to be trained separately. A modality encoder receives an MRI or CT image as input and encodes it into a semantic feature map. A modality decoder takes as input the condition feature map formed by stacking the semantic feature map output by a modality encoder with a one-hot condition vector along the channel direction, and the one-hot condition vector determines the output sub-modality. The modality discriminator has three outputs: true or false; CT or MRI; and the integer index of the sub-modality within the CT or MRI modality. The feature discriminator has only one output: CT or MRI. The output of a lesion task decoder depends on the lesion task being processed: for a segmentation task it outputs a lesion segmentation map, and for a detection task it outputs the size and coordinates of a lesion detection frame.
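The two discriminator interfaces could be realized roughly as follows, a sketch assuming convolutional backbones supplied elsewhere; layer shapes and names are illustrative:

```python
import torch.nn as nn

class ModalityDiscriminator(nn.Module):
    """Three outputs: real/fake, CT-or-MRI, and sub-modality index logits."""
    def __init__(self, backbone, feat_dim, num_submodalities):
        super().__init__()
        self.backbone = backbone                              # shared feature extractor
        self.real_fake = nn.Linear(feat_dim, 1)               # output [0]: true or false
        self.ct_mri = nn.Linear(feat_dim, 1)                  # output [1]: CT or MRI
        self.submod = nn.Linear(feat_dim, num_submodalities)  # output [2]: sub-modality index
    def forward(self, image):
        h = self.backbone(image).flatten(1)
        return [self.real_fake(h), self.ct_mri(h), self.submod(h)]

class FeatureDiscriminator(nn.Module):
    """Single output: classifies a semantic feature map as CT or MRI."""
    def __init__(self, backbone, feat_dim):
        super().__init__()
        self.backbone = backbone
        self.ct_mri = nn.Linear(feat_dim, 1)
    def forward(self, code):
        return self.ct_mri(self.backbone(code).flatten(1))
```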
As shown in fig. 3, the method further comprises a step of training the GAN network before step 1), and the detailed steps comprise:
S1) designing each component of the GAN network, wherein the components comprise an MRI modality encoder, a CT modality encoder, an MRI modality decoder, a CT modality decoder, an MRI lesion task decoder, a CT lesion task decoder, a modality discriminator and a feature discriminator;
S2) obtaining CT multi-modality training data with registered lesion labels label_x of the corresponding lesion processing task and MRI multi-modality training data with registered lesion labels label_y of the corresponding lesion processing task as training data; the modalities and sub-modalities of the training data need not be registered with one another;
S3) combining the MRI modality encoder and the CT lesion task decoder to obtain a CT lesion task processor, and performing its lesion task processing training based on the CT multi-modality training data with registered lesion labels label_x; combining the CT modality encoder and the MRI lesion task decoder to obtain an MRI lesion task processor, and performing its lesion task processing training based on the MRI multi-modality training data with registered lesion labels label_y; the CT modality encoder and the CT modality decoder form a CT internal multi-modality converter and perform CT-to-CT training based on the training data; the CT modality encoder and the MRI modality decoder form a CT-MRI multi-modality converter and perform CT-to-MRI training based on the training data; the MRI modality encoder and the MRI modality decoder form an MRI internal multi-modality converter and perform MRI-to-MRI training based on the training data; and the MRI modality encoder and the CT modality decoder form an MRI-CT multi-modality converter and perform MRI-to-CT training based on the training data;
S4) comparing the lesion labels label_x in the CT multi-modality training data respectively with the lesion labels label_x obtained from CT-to-CT training, from CT-to-MRI training, and from the CT lesion task processing training, and comparing the lesion labels label_y in the MRI multi-modality training data respectively with the lesion labels label_y obtained from MRI-to-MRI training, from MRI-to-CT training, and from the MRI lesion task processing training; if the comparison result of any training does not meet the requirement, jumping back to step S3) to continue training; otherwise ending and exiting.
The data preparation for training in this embodiment is as follows. First, data of the four MRI modalities T1, T2, T1c and Flair are prepared, denoted y_0, y_1, y_2, y_3; the corresponding lesion processing task is a lung tumor segmentation task, so the label is a tumor segmentation label, denoted label_y. Then, data of the two CT modalities of the lung, namely the high-dose CT image and the PET-CT image, are prepared, denoted x_0, x_1; the corresponding lesion processing task is a lung nodule detection task, so the label is the size and coordinate label of the lung nodule detection frame, denoted label_x. In the data preprocessing stage, all data are normalized, and the image data of the MRI and CT modalities are adjusted to the same size by up-sampling and down-sampling. The data are then divided into training and test sets according to a fixed proportion.
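A preprocessing sketch matching this description (min-max normalization, resampling to a common size, and a fixed-proportion split); the 256x256 size and the 4:1 split ratio are illustrative assumptions:

```python
import numpy as np
import torch
import torch.nn.functional as F

def preprocess(slice_2d: np.ndarray, size=(256, 256)) -> torch.Tensor:
    """Normalize one image and resample it to the common size."""
    v = torch.from_numpy(slice_2d.astype(np.float32))[None, None]  # (1, 1, H, W)
    v = (v - v.min()) / (v.max() - v.min() + 1e-8)                 # min-max normalization
    return F.interpolate(v, size=size, mode="bilinear", align_corners=False)

def split(samples: list, train_fraction=0.8):
    """Divide the data into a training set and a test set by proportion."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]
```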
The module combination training process of step S3) in this embodiment comprises four training parts: CT-to-CT training, MRI-to-MRI training, CT-to-MRI training and MRI-to-CT training. The training process uses real CT and MRI training data; the modalities and sub-modalities need not be registered with one another, but the CT images must have registered lesion labels label_x of the corresponding lesion processing task, and the MRI images must have registered lesion labels label_y of the corresponding lesion processing task. The core flow of the module combination training is shown in fig. 4.
When designing the components of the GAN network in step S1) of this embodiment, the network structures of the MRI modality encoder, CT modality encoder, MRI modality decoder, CT modality decoder, lung tumor segmentation task decoder, lung nodule detection task decoder, modality discriminator and feature discriminator must be designed such that the output size of each encoder matches the input size required by each decoder. The subsequent training process specifically comprises:
1. Synchronous training: the eight modules, namely the MRI modality encoder, CT modality encoder, MRI modality decoder, CT modality decoder, lung tumor segmentation task decoder, lung nodule detection task decoder, modality discriminator and feature discriminator, are trained synchronously using real training set data.
2. Training the lesion processors alone: a CT-modality lung nodule detection task processor is trained separately from the combination of the CT modality encoder and the lung nodule detection task decoder using real CT training data, and an MRI-modality lung tumor segmentation task processor is trained from the combination of the MRI modality encoder and the lung tumor segmentation task decoder using real MRI training data.
3. Module recombination: the modules obtained from synchronous training are recombined. The CT modality encoder and the CT modality decoder are combined into a generator for interconversion among the CT sub-modalities; the MRI modality encoder and the MRI modality decoder are combined into a generator for interconversion among the MRI sub-modalities; the CT modality encoder and the MRI modality decoder are combined into a generator converting CT sub-modalities into MRI sub-modalities; the MRI modality encoder and the CT modality decoder are combined into a generator converting MRI sub-modalities into CT sub-modalities; the CT modality encoder and the lung tumor segmentation task decoder are combined into a CT-modality lung tumor segmentation task processor; and the MRI modality encoder and the lung nodule detection task decoder are combined into an MRI-modality lung nodule detection task processor.
4. Single-modality data are converted into multi-modality data: each generator obtained by module recombination reconstructs or converts each single-modality data item in the test set to generate data of all modalities. In the end, six generated data sets consistent with the test set in modality and number are constructed. The MRI data obtained by CT conversion carry the size and coordinate labels of the lung nodule detection frame, and the CT data obtained by MRI conversion carry the segmentation labels of the lung tumor.
5. Lesion processing and evaluation: the separately trained CT-modality lung nodule detection task processor and MRI-modality lung tumor segmentation task processor, together with the CT-modality lung tumor segmentation task processor and MRI-modality lung nodule detection task processor obtained by module recombination, perform the corresponding tumor segmentation or lung nodule detection on the six multi-modality data sets generated from single-modality data. The processing results are then compared with the real labels for evaluation: if the evaluation indices meet expectations, the quality of the generated multi-modality data is good; otherwise the network structures of the modules must be redesigned and retrained.
As shown in figs. 4 and 5, in CT-to-CT training, let x_i be the CT image of any sub-modality i of the CT modality. In this embodiment, the CT modality encoder encodes x_i into a semantic feature map code_{x,i}, which is then stacked along the channel dimension with the one-hot condition vectors corresponding to each sub-modality of the CT modality, yielding different condition feature maps. After code_{x,i} is stacked with the one-hot condition vector one_hot(i) of the CT sub-modality i itself, the CT modality decoder decodes it to restore the reconstructed CT image x_{r,i} of x_i. After code_{x,i} is stacked with the one-hot condition vector one_hot(j) of any other CT sub-modality j (j ≠ i), the CT modality decoder decodes it into the CT image x_{t,i,j} of sub-modality j; the CT modality encoder then encodes x_{t,i,j} into the semantic feature map code_{t,i,x,j}, which is stacked with one_hot(i) and decoded by the CT modality decoder to restore the cyclically reconstructed CT image x_{cr,j,i} of x_i. In addition, the CT lesion task decoder processes code_{x,i} and every code_{t,i,x,j}, producing the lesion label outputs label_{g,x,i} and label_{t,j,x,i}, whose corresponding real label is label_{x,i}. Meanwhile, the modality discriminator performs true-false discriminative learning with x_i as a positive sample and x_{t,i,j} as negative samples, providing an adversarial loss for the generating components; it also performs sub-modality classification learning with i as the label of x_i and j as the label of x_{t,i,j}, and CT/MRI classification learning with the index value of the CT modality as the label of x_i and x_{t,i,j}, providing consistency losses for the generating components. The feature discriminator performs CT/MRI classification learning with the index value of the CT modality as the label of code_{x,i}, providing an adversarial loss for the generating components. A sketch of one such training step follows.
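The generator side of one CT-to-CT step could be sketched as follows; the L1 losses and helper names are assumptions, and the discriminator terms of the full objective are given in the loss-function section below:

```python
import torch
import torch.nn.functional as F

def decode_with_condition(decoder, code, idx, num_submodalities):
    """Stack one_hot(idx) condition channels onto a feature map and decode."""
    n, _, h, w = code.shape
    cond = torch.zeros(n, num_submodalities, h, w, device=code.device)
    cond[:, idx] = 1.0
    return decoder(torch.cat([code, cond], dim=1))

def ct_to_ct_step(EC_x, DC_x, DC_l_x, x_i, i, j, num_ct):
    """Reconstruction, translation, cycle reconstruction and lesion decoding for one pair (i, j)."""
    code_x_i = EC_x(x_i)
    x_r_i = decode_with_condition(DC_x, code_x_i, i, num_ct)    # reconstruction x_{r,i}
    x_t_i_j = decode_with_condition(DC_x, code_x_i, j, num_ct)  # translation x_{t,i,j}
    code_t = EC_x(x_t_i_j)                                      # re-encoding code_{t,i,x,j}
    x_cr_j_i = decode_with_condition(DC_x, code_t, i, num_ct)   # cyclic reconstruction x_{cr,j,i}
    label_g = DC_l_x(code_x_i)                                  # lesion output label_{g,x,i}
    label_t = DC_l_x(code_t)                                    # lesion output label_{t,j,x,i}
    rebuild = F.l1_loss(x_r_i, x_i)                             # contributes to loss_rebuild
    cycle = F.l1_loss(x_cr_j_i, x_i)                            # contributes to loss_cycle,rebuild
    return x_t_i_j, label_g, label_t, rebuild + cycle
```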
As shown in figs. 4 and 5, the MRI-to-MRI training process is the same as the CT-to-CT training process, except that the trained components are changed to the MRI modality encoder, the MRI modality decoder and the MRI lesion task decoder, the input is changed to the MRI image y_i of a sub-modality i of the MRI modality, and the learned labels are changed to the labels corresponding to the MRI modality.
As shown in figs. 4 and 6, in CT-to-MRI training, the CT modality encoder encodes the CT image x_i of any CT sub-modality i into a semantic feature map code_{x,i}, which is then stacked along the channel dimension with the one-hot condition vectors corresponding to each sub-modality of the MRI modality, yielding different condition feature maps. After code_{x,i} is stacked with the one-hot condition vector one_hot(j) of any MRI sub-modality j, the MRI modality decoder decodes it into the MRI image y_{t,x,i,j} of MRI sub-modality j; the MRI modality encoder then encodes y_{t,x,i,j} into the semantic feature map code_{t,x,i,y,j}, which is stacked with one_hot(i) and decoded by the CT modality decoder to restore the cyclically reconstructed CT image x_{cr,y,j,i} of x_i. Likewise, the CT lesion task decoder processes code_{x,i} and every code_{t,x,i,y,j}, producing the lesion label outputs label_{g,x,i} and label_{t,y,j,x,i}, whose corresponding real label is label_{x,i}. Meanwhile, the modality discriminator performs true-false discriminative learning with x_i as a positive sample and y_{t,x,i,j} as negative samples, providing an adversarial loss for the generating components; it also performs sub-modality classification learning with i as the label of x_i and j as the label of y_{t,x,i,j}, and CT/MRI classification learning with the index value of the CT modality as the label of x_i and the index value of the MRI modality as the label of y_{t,x,i,j}, providing consistency losses for the generating components.
The MRI-to-CT training process is the same as the CT-to-MRI training process, except that the corresponding CT and MRI inputs, components, labels and so on are exchanged.
As shown in fig. 7, this embodiment also requires training of the segmentation and detection networks: the CT image x_i of any sub-modality is encoded by the CT modality encoder EC_x into a semantic feature map code, and the CT lesion task decoder DC_{l,x} then generates the CT lesion task output label_{g,x,i}, whose lesion label is label_x; likewise, the MRI image y_i of any sub-modality is encoded by the MRI modality encoder EC_y into a semantic feature map code, and the MRI lesion task decoder DC_{l,y} then generates the MRI lesion task output label_{g,y,i}, whose lesion label is label_y.
In addition, this embodiment further comprises a step of checking lesion validity of the trained GAN network. As shown in fig. 8, a CT lesion task processor obtained by combining the CT modality encoder and the CT lesion task decoder is trained using real CT training data, and its processing capability is tested on real CT test data; when the evaluation index of the test-set processing results reaches a predetermined value, the CT lesion task processor is considered trained. The trained CT lesion task processor then processes CT images carrying CT lesion task labels label_x, and the processing results are evaluated: if the evaluation results meet expectations, the generated data are good; otherwise the network structure of the module is adjusted and retrained. The loss function loss_label for training the lesion task processor is:
loss_label = Σ_i || label_{g,x,i} − label_{x,i} ||_1
In the above formula, label_{x,i} denotes the label of the real CT lesion task, and label_{g,x,i} denotes the CT lesion task label generated by the CT lesion task decoder DC_{l,x}.
Similarly, an MRI lesion task processor obtained by combining the MRI modality encoder and the MRI lesion task decoder is first trained using real MRI training data, and its processing capability is tested on real MRI test data; when the evaluation index of the test-set processing results reaches the predetermined value, the MRI lesion task processor is considered trained. The trained MRI lesion task processor then processes MRI images carrying MRI lesion task labels label_y, and the processing results are evaluated: if the evaluation results meet expectations, the generated data are good; otherwise the network structure of the module is adjusted and retrained. The loss function for training this lesion task processor is:
loss_label = Σ_i || label_{g,y,i} − label_{y,i} ||_1
In the above formula, label_{y,i} denotes the label of the real MRI lesion task, and label_{g,y,i} denotes the MRI lesion task label generated by the MRI lesion task decoder DC_{l,y}.
Since it is difficult to obtain CT images carrying MRI lesion task labels label_y, this embodiment uses the MRI lesion task processor obtained in the training process by combining the CT modality encoder and the MRI lesion task decoder to perform lesion processing on CT images, producing MRI lesion task labels label_y, and evaluates the processing results: if the evaluation results meet expectations, the generated data are good; otherwise the network structure of the module must be adjusted and retrained. Similarly, since it is difficult to obtain MRI images carrying CT lesion task labels label_x, this embodiment uses the CT lesion task processor obtained in the training process by combining the MRI modality encoder and the CT lesion task decoder to perform lesion processing on MRI images, producing CT lesion task labels label_x, and evaluates the processing results: if the evaluation results meet expectations, the generated data are good; otherwise the network structure of the module must be adjusted and retrained.
In this embodiment, the loss functions of the GAN network, including the bidirectional adversarial losses, are designed as follows:
1. The modality discriminator is updated independently; its output is a list with three elements, and the specific losses are as follows:
1.1. True-false discrimination loss:
loss_Discriminator,1 = −Σ log Discriminator(r)[0] − Σ log(1 − Discriminator(g)[0]),
where r ranges over the real images x_i and y_i, and g ranges over all converted, synthesized images (x_{t,i,j}, y_{t,i,j}, y_{t,x,i,j}, x_{t,y,i,j}).
In the above formula, loss_Discriminator,1 denotes the true-false discrimination loss; Discriminator(x_i)[0] denotes the first discrimination output of the discriminator for the real CT image x_i, whose value 0 or 1 indicates that the image is judged synthesized or real, respectively; the other symbols follow by analogy from the preceding symbol descriptions.
1.2. CT-or-MRI modality discrimination loss:
loss_Discriminator,2 = −Σ_i log(1 − Discriminator(x_i)[1]) − Σ_i log Discriminator(y_i)[1]
In the above formula, loss_Discriminator,2 denotes the CT-or-MRI modality discrimination loss; Discriminator(x_i)[1] denotes the second discrimination output of the discriminator for the real CT image x_i, whose value 0 or 1 indicates that the image is judged a CT image or an MRI image, respectively; the other symbols follow by analogy from the preceding symbol descriptions.
1.3. Sub-modality discrimination loss:
loss_Discriminator,3 = Σ_i CrossEntropy(Discriminator(x_i)[2], i) + Σ_i CrossEntropy(Discriminator(y_i)[2], i)
In the above formula, loss_Discriminator,3 denotes the sub-modality discrimination loss of the CT or MRI modality; Discriminator(x_i)[2] denotes the third discrimination output of the discriminator for the real CT image x_i, an integer-valued modality label indicating which sub-modality of the CT or MRI modality the image is judged to belong to; the other symbols follow by analogy from the preceding symbol descriptions.
The total modality discriminator loss is the sum of the three:
loss_Discriminator = loss_Discriminator,1 + loss_Discriminator,2 + loss_Discriminator,3
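Under the reconstructed forms above (the binary and categorical cross-entropy choices are assumptions of this sketch), the independent modality discriminator update could look as follows; the three-output discriminator interface matches the sketch given earlier:

```python
import torch
import torch.nn.functional as F

def modality_discriminator_loss(D, reals, fakes):
    """loss_Discriminator = true/false + CT-or-MRI + sub-modality terms.

    reals: list of (image, is_mri, submod) where submod is a LongTensor of
    sub-modality indices; fakes: list of converted (synthesized) images."""
    loss1 = loss2 = loss3 = 0.0
    for image, is_mri, submod in reals:
        rf, cm, sm = D(image)
        loss1 = loss1 + F.binary_cross_entropy_with_logits(rf, torch.ones_like(rf))
        target = torch.full_like(cm, float(is_mri))          # CT = 0, MRI = 1
        loss2 = loss2 + F.binary_cross_entropy_with_logits(cm, target)
        loss3 = loss3 + F.cross_entropy(sm, submod)          # integer sub-modality label
    for image in fakes:                                      # converted images are negatives
        rf, _, _ = D(image.detach())
        loss1 = loss1 + F.binary_cross_entropy_with_logits(rf, torch.zeros_like(rf))
    return loss1 + loss2 + loss3
```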
2. The feature discriminator is updated independently and provides adversarial losses, referred to in this embodiment as bidirectional adversarial losses, to both the MRI-modality generating components and the CT-modality generating components. The feature discriminator loss function is:
loss_FeatureDiscriminator = −Σ_i log(1 − FeatureDiscriminator(code_{x,i})) − Σ_i log FeatureDiscriminator(code_{y,i})
In the above formula, loss_FeatureDiscriminator denotes the loss of the feature discriminator; FeatureDiscriminator(code_{x,i}) denotes the modality discrimination output for the input feature map code_{x,i}, whose value 0 or 1 indicates that the discrimination result is a CT image or an MRI image, respectively.
3. The other modules of this embodiment are updated and trained through an optimizer; the loss terms comprise the guidance loss provided by the modality discriminator, the bidirectional adversarial loss provided by the feature discriminator, the modality reconstruction self-supervision loss, the modality cycle reconstruction self-supervision loss, the modality cycle reconstruction consistency loss, the semantic consistency loss, the lesion supervision loss and the lesion consistency loss. Specifically:
3.1. Modality discriminator guidance loss:
loss_Adversarial,1 = −Σ log Discriminator(g)[0] + Σ CrossEntropy(Discriminator(g)[1], m) + Σ CrossEntropy(Discriminator(g)[2], j),
where g ranges over all converted images, m is the index value of the target modality (CT or MRI) of g, and j is its target sub-modality.
In the above formula, loss_Adversarial,1 denotes the guidance loss, comprising the adversarial loss and the category consistency losses provided by the modality discriminator; Discriminator(x_{t,i,j})[0] denotes the first discrimination output of the discriminator for the converted, synthesized CT image x_{t,i,j}, whose value 0 or 1 indicates that the image is judged synthesized or real, respectively; Discriminator(x_{t,i,j})[1] denotes the second discrimination output, whose value 0 or 1 indicates that the image is judged a CT image or an MRI image, respectively; Discriminator(x_{t,i,j})[2] denotes the third discrimination output, an integer-valued modality label indicating which CT or MRI sub-modality the image is judged to belong to; the other symbols follow by analogy from the preceding symbol descriptions.
3.2. Feature discriminator bidirectional adversarial loss:
loss_Adversarial,2 = −Σ_i log FeatureDiscriminator(code_{x,i}) − Σ_i log(1 − FeatureDiscriminator(code_{y,i}))
In the above formula, loss_Adversarial,2 denotes the bidirectional adversarial loss that the feature discriminator FeatureDiscriminator provides simultaneously to the CT-modality generating components and the MRI-modality generating components, i.e., each encoder is trained so that its codes are classified as the opposite modality; FeatureDiscriminator(code_{x,i}) denotes the modality discrimination output for the input feature map code_{x,i}, whose value 0 or 1 indicates that the discrimination result is a CT image or an MRI image, respectively, and FeatureDiscriminator(code_{y,i}) denotes the modality discrimination output for the input feature map code_{y,i}.
3.3. Modality reconstruction self-supervision loss:
loss_rebuild = Σ_i || x_i − x_{r,i} ||_1 + Σ_i || y_i − y_{r,i} ||_1
In the above formula, loss_rebuild denotes the modality reconstruction self-supervision loss, x_i is a real CT image, x_{r,i} is its reconstructed CT-modality image, and y_i and y_{r,i} are the corresponding MRI terms.
3.4. Modality cycle reconstruction self-supervision loss:
loss_cycle,rebuild = Σ_{i,j} || x_i − x_{cr,j,i} ||_1 + Σ_{i,j} || x_i − x_{cr,y,j,i} ||_1 + Σ_{i,j} || y_i − y_{cr,j,i} ||_1 + Σ_{i,j} || y_i − y_{cr,x,j,i} ||_1
In the above formula, loss_cycle,rebuild denotes the modality cycle reconstruction self-supervision loss, x_{cr,j,i} denotes the cyclic reconstruction obtained by interconversion among the sub-modalities inside the CT modality, and x_{cr,y,j,i} denotes the cyclic reconstruction obtained by converting the CT image to MRI and then converting back; the other symbols follow by analogy from the preceding symbol descriptions.
3.5. Modality cycle reconstruction consistency loss:
loss_cycle,consistency = Σ_{i,j} || x_{cr,j,i} − x_{cr,y,j,i} ||_1 + Σ_{i,j} || y_{cr,j,i} − y_{cr,x,j,i} ||_1
In the above formula, loss_cycle,consistency denotes the modality cycle reconstruction consistency loss, i.e., the same real image, after passing through different intermediate modality conversions, should convert back to the same original image; the symbols are as described in the preceding symbol descriptions.
3.6. Semantic consistency loss:
loss_code,consistency = Σ_{i,j} || code_{x,i} − code_{t,i,x,j} ||_1 + Σ_{i,j} || code_{x,i} − code_{t,x,i,y,j} ||_1 + analogous terms for the MRI branch
In the above formula, loss_code,consistency denotes the semantic consistency loss, and code_{x,i}, code_{t,i,x,j} and code_{t,x,i,y,j} respectively denote the encoding of the real CT image x_i produced directly by the encoder, its re-encoding after conversion to another CT sub-modality, and its re-encoding after conversion to an MRI sub-modality.
3.7. Lesion supervision loss:
loss_label = Σ_i || label_{g,x,i} − label_{x,i} ||_1 + Σ_{i,j} || label_{t,j,x,i} − label_{x,i} ||_1 + Σ_{i,j} || label_{t,x,j,y,i} − label_{x,i} ||_1 + analogous terms for the MRI branch
In the above formula, loss_label denotes the lesion supervision loss; label_{t,x,j,y,i} denotes the label of the CT lesion task generated by the lesion processor after the real CT image is converted to MRI; the remaining symbols follow by analogy from the preceding symbol descriptions.
3.8. Lesion consistency loss:
loss_label,consistency = Σ_{i,j} || label_{t,j,x,i} − label_{t,x,j,y,i} ||_1 + analogous terms for the MRI branch
In the above formula, loss_label,consistency denotes the lesion consistency loss, i.e., the labels generated by the lesion processor for the converted images obtained from the same real image through different intermediate modalities should be consistent; label_{t,x,j,y,i} denotes the label of the CT lesion task generated by the lesion processor after the real CT image is converted to MRI; the remaining symbols follow by analogy from the preceding symbol descriptions.
Thus, the total loss of each generator composed of an encoder and a decoder is the sum of the above losses:
loss_Generator = loss_Adversarial,1 + loss_Adversarial,2 + loss_rebuild + loss_cycle,rebuild + loss_cycle,consistency + loss_code,consistency + loss_label + loss_label,consistency
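Assembling the generator-side objective for the CT branch under the reconstructed formulas above (the MRI branch is symmetric; the L1 and cross-entropy forms are this sketch's assumptions, not the patent's verbatim definitions):

```python
import torch
import torch.nn.functional as F

def generator_loss_ct_branch(D, x_i, x_t_i_j, x_r_i, x_cr_j_i, x_cr_y_j_i,
                             code_x_i, code_t_i_x_j, label_g_x_i, label_x_i):
    """Sum the loss_Generator terms contributed by one CT sample.

    loss_Adversarial,2 (feature discriminator) and loss_label,consistency are
    omitted from this sketch for brevity."""
    rf, cm, sm = D(x_t_i_j)                                                  # discriminator on converted image
    adv = F.binary_cross_entropy_with_logits(rf, torch.ones_like(rf))        # fool output [0]
    rebuild = F.l1_loss(x_r_i, x_i)                                          # loss_rebuild
    cycle_rebuild = F.l1_loss(x_cr_j_i, x_i) + F.l1_loss(x_cr_y_j_i, x_i)    # loss_cycle,rebuild
    cycle_consistency = F.l1_loss(x_cr_j_i, x_cr_y_j_i)                      # loss_cycle,consistency
    code_consistency = F.l1_loss(code_t_i_x_j, code_x_i)                     # loss_code,consistency
    label_loss = F.l1_loss(label_g_x_i, label_x_i)                           # loss_label
    return adv + rebuild + cycle_rebuild + cycle_consistency + code_consistency + label_loss
```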
Finally, the modules of the trained GAN network can be used in combination, as shown in fig. 9:
1. The CT modality encoder and the CT modality decoder are combined into the CT internal multi-modality converter: a CT image of any sub-modality is encoded by the encoder into a semantic feature map, which is stacked along the channel direction with the one-hot condition vector of the selected CT sub-modality and finally decoded by the CT modality decoder to generate the CT image of the selected sub-modality. The lesion label of the generated image is the label_x of the CT lesion task.
2. The MRI modality encoder and the MRI modality decoder are combined into the MRI internal multi-modality converter: an MRI image of any sub-modality is encoded by the encoder into a semantic feature map, which is stacked along the channel direction with the one-hot condition vector of the selected MRI sub-modality and finally decoded by the MRI modality decoder to generate the MRI image of the selected sub-modality. The lesion label of the generated image is the label_y of the MRI lesion task.
3. The CT modality encoder and the MRI modality decoder are combined into the CT-MRI multi-modality converter: a CT image of any sub-modality is encoded by the encoder into a semantic feature map, which is stacked along the channel direction with the one-hot condition vector of the selected MRI sub-modality and finally decoded by the MRI modality decoder to generate the MRI image of the selected sub-modality. The lesion label of the generated image is the label_x of the CT lesion task.
4. The MRI modality encoder and the CT modality decoder are combined into the MRI-CT multi-modality converter: an MRI image of any sub-modality is encoded by the encoder into a semantic feature map, which is stacked along the channel direction with the one-hot condition vector of the selected CT sub-modality and finally decoded by the CT modality decoder to generate the CT image of the selected sub-modality. The lesion label of the generated image is the label_y of the MRI lesion task.
5. The CT modality encoder and the MRI lesion task decoder are combined into the MRI lesion task processor: a CT image of any sub-modality is encoded by the encoder into a semantic feature map, and the MRI lesion task decoder then generates the processing result of the MRI lesion task. This processor can handle MRI lesion tasks whose input is a CT image and whose lesion label is label_y.
6. The MRI modality encoder and the CT lesion task decoder are combined into the CT lesion task processor: an MRI image of any sub-modality is encoded by the encoder into a semantic feature map, and the CT lesion task decoder then generates the processing result of the CT lesion task. This processor can handle CT lesion tasks whose input is an MRI image and whose lesion label is label_x. A combined usage sketch follows.
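A usage sketch of the six combinations above, reusing the hypothetical convert helper sketched earlier; variable names, sub-modality indices and module keys are illustrative:

```python
# modules holds the trained parts: EC_x, EC_y, DC_x, DC_y, DC_l_x, DC_l_y.
# convert(...) is the encode-stack-decode helper sketched earlier in this document.
ct_generated  = convert(modules["EC_x"], modules["DC_x"], ct_img,  target_idx=1, num_targets=2)  # 1: CT -> CT
mri_generated = convert(modules["EC_y"], modules["DC_y"], mri_img, target_idx=2, num_targets=4)  # 2: MRI -> MRI
mri_from_ct   = convert(modules["EC_x"], modules["DC_y"], ct_img,  target_idx=0, num_targets=4)  # 3: CT -> MRI
ct_from_mri   = convert(modules["EC_y"], modules["DC_x"], mri_img, target_idx=0, num_targets=2)  # 4: MRI -> CT
mri_task_out  = modules["DC_l_y"](modules["EC_x"](ct_img))   # 5: MRI lesion task processor on a CT image
ct_task_out   = modules["DC_l_x"](modules["EC_y"](mri_img))  # 6: CT lesion task processor on an MRI image
```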
In addition, this embodiment further provides a modular GAN-based multi-modal MRI and multi-modal CT conversion system, which comprises a computer device programmed or configured to execute the steps of the above modular GAN-based multi-modal MRI and multi-modal CT conversion method, or whose storage medium stores a computer program programmed or configured to execute the method.
In addition, the present embodiment also provides a computer readable storage medium, which stores a computer program programmed or configured to execute the method for converting the modular GAN-based multi-modality MRI and multi-modality CT according to the present embodiment.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; all technical solutions within the idea of the present invention belong to its protection scope. It should be noted that, for those skilled in the art, modifications and refinements made without departing from the principle of the present invention are also regarded as falling within the protection scope of the present invention.

Claims (10)

1. A modular GAN-based multi-modality MRI and multi-modality CT conversion method, characterized by comprising the following implementation steps:
1) judging the type of the task to be executed: if the task is CT-to-CT modality conversion, jump to step 2); if CT-to-MRI modality conversion, jump to step 3); if MRI-to-MRI modality conversion, jump to step 4); if MRI-to-CT modality conversion, jump to step 5); if CT-to-MRI lesion task conversion, jump to step 6); and if MRI-to-CT lesion task conversion, jump to step 7);
2) combining the CT modality encoder and the CT modality decoder in the trained GAN network to obtain a CT internal multi-modality converter, and converting the input CT image of any sub-modality into a generated CT image of the target sub-modality through the CT internal multi-modality converter; return;
3) combining the CT modality encoder and the MRI modality decoder in the trained GAN network to obtain a CT-MRI multi-modality converter, and converting the input CT image of any sub-modality into a generated MRI image of the target sub-modality through the CT-MRI multi-modality converter; return;
4) combining the MRI modality encoder and the MRI modality decoder in the trained GAN network to obtain an MRI internal multi-modality converter, and converting the input MRI image of any sub-modality into a generated MRI image of the target sub-modality through the MRI internal multi-modality converter; return;
5) combining the MRI modality encoder and the CT modality decoder in the trained GAN network to obtain an MRI-CT multi-modality converter, and converting the input MRI image of any sub-modality into a generated CT image of the target sub-modality through the MRI-CT multi-modality converter; return;
6) combining the CT modality encoder in the trained GAN network with the MRI lesion task decoder to obtain an MRI lesion task processor, and converting the input CT image of any sub-modality into an MRI lesion task result through the MRI lesion task processor; return;
7) combining the MRI modality encoder in the trained GAN network with the CT lesion task decoder to obtain a CT lesion task processor, and converting the input MRI image of any sub-modality into a CT lesion task result through the CT lesion task processor.
2. The modular GAN-based multi-modal MRI and multi-modal CT conversion method according to claim 1, wherein the step of converting the input CT image of any sub-modality into a generated CT image of the target sub-modality through the CT internal multi-modality converter in step 2) comprises: encoding the CT image of any sub-modality through the CT modality encoder of the CT internal multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with the one-hot condition vector that selects the target sub-modality along the channel dimension, and finally converting through the CT modality decoder of the CT internal multi-modality converter to generate the CT image; the lesion label of the generated CT image is the label_x of the CT lesion task.
3. The modular GAN-based multi-modal MRI and multi-modal CT conversion method according to claim 1, wherein the step of converting the input CT image of any sub-modality into a generated MRI image of the target sub-modality through the CT-MRI multi-modality converter in step 3) comprises: encoding the CT image of any sub-modality through the CT modality encoder of the CT-MRI multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with the one-hot condition vector that selects the target sub-modality along the channel dimension, and finally converting through the MRI modality decoder of the CT-MRI multi-modality converter to generate the MRI image; the lesion label of the generated MRI image is the label_x of the CT lesion task.
4. The method for converting multi-modal MRI and multi-modal CT based on modular GAN as claimed in claim 1, wherein the step of converting the input MRI image of any modality into a generated MRI image of the target modality through the MRI internal multi-modality converter in step 4) comprises: encoding the MRI image of any modality through the MRI modality encoder of the MRI internal multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with a one-hot condition vector channel that selects the target modality, and converting the stacked result through the MRI modality decoder of the MRI internal multi-modality converter to generate an MRI image, wherein the lesion label of the generated MRI image is the label label_y of the MRI lesion task;
5. The modular GAN-based multi-modal MRI and multi-modal CT conversion method according to claim 1, wherein the step of converting the input MRI image of any modality into a generated CT image of the target modality through the MRI-CT multi-modality converter in step 5) comprises: encoding the MRI image of any modality through the MRI modality encoder of the MRI-CT multi-modality converter to obtain a semantic feature map, stacking the semantic feature map with a one-hot condition vector channel that selects the target modality, and converting the stacked result through the CT modality decoder of the MRI-CT multi-modality converter to generate a CT image, wherein the lesion label of the generated CT image is the label label_y of the MRI lesion task;
6. The method for converting multi-modal MRI and multi-modal CT based on modular GAN as claimed in claim 1, wherein the step of converting the input CT image of any modality into an MRI lesion task result through the MRI lesion task processor in step 6) comprises: encoding the CT image of any modality through the CT modality encoder of the MRI lesion task processor to obtain a semantic feature map, and generating the MRI lesion task result through the MRI lesion task decoder of the MRI lesion task processor, wherein the lesion label of the MRI lesion task is label_y;
7. The modular GAN-based multi-modal MRI and multi-modal CT conversion method as claimed in claim 1, wherein the step of converting the input MRI image of any modality into a CT lesion task result through the CT lesion task processor in step 7) comprises: encoding the MRI image of any modality through the MRI modality encoder of the CT lesion task processor to obtain a semantic feature map, and generating the CT lesion task result through the CT lesion task decoder of the CT lesion task processor, wherein the lesion label of the CT lesion task is label_x.
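Claims 6 and 7 drop the conditioning step: the encoder of one modality feeds the lesion-task decoder trained for the other modality. A minimal sketch, assuming the lesion task is binary segmentation (the patent does not fix the task type) and that the encoder and decoder arguments are trained modules:

```python
# Sketch of the cross-modality lesion task processors of claims 6-7,
# assuming the lesion task is binary segmentation; names are illustrative.
import torch
import torch.nn as nn

class LesionTaskProcessor(nn.Module):
    def __init__(self, encoder: nn.Module, task_decoder: nn.Module):
        super().__init__()
        self.encoder = encoder
        self.task_decoder = task_decoder

    @torch.no_grad()
    def forward(self, image: torch.Tensor) -> torch.Tensor:
        features = self.encoder(image)            # shared semantic feature map
        logits = self.task_decoder(features)      # per-pixel lesion logits
        return (logits.sigmoid() > 0.5).float()   # binary lesion mask

# Claim 7: CT lesion task result from an MRI input of any modality:
# processor = LesionTaskProcessor(mri_encoder, ct_task_decoder)
# lesion_mask = processor(mri_image)
```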
8. The modular GAN-based multi-modal MRI and multi-modal CT conversion method according to claim 1, wherein step 1) is preceded by a step of training the GAN network, the detailed steps comprising:
S1) designing the components of the GAN network, the components comprising an MRI modality encoder, a CT modality encoder, an MRI modality decoder, a CT modality decoder, an MRI lesion task decoder, a CT lesion task decoder, a modality discriminator and a feature discriminator;
S2) obtaining registered CT multi-modality training data with the label label_x of the corresponding lesion processing task and registered MRI multi-modality training data with the label label_y of the corresponding lesion processing task as the training data, wherein the individual modalities and sub-modalities of the training data do not need to be registered with one another;
S3) combining the MRI modality encoder and the CT lesion task decoder to obtain the CT lesion task processor, and performing lesion task processing training of the CT lesion task processor based on the registered CT multi-modality training data with the corresponding lesion processing task label label_x; combining the CT modality encoder and the MRI lesion task decoder to obtain the MRI lesion task processor, and performing lesion task processing training of the MRI lesion task processor based on the registered MRI multi-modality training data with the corresponding lesion processing task label label_y; forming the CT internal multi-modality converter from the CT modality encoder and the CT modality decoder and performing CT-to-CT training based on the training data; forming the CT-MRI multi-modality converter from the CT modality encoder and the MRI modality decoder and performing CT-to-MRI training based on the training data; forming the MRI internal multi-modality converter from the MRI modality encoder and the MRI modality decoder and performing MRI-to-MRI training based on the training data; and forming the MRI-CT multi-modality converter from the MRI modality encoder and the CT modality decoder and performing MRI-to-CT training based on the training data;
S4) comparing the lesion label label_x in the CT multi-modality training data respectively with the lesion label label_x obtained by the CT-to-CT training, the lesion label label_x obtained by the CT-to-MRI training, and the lesion label label_x obtained by the CT lesion task processing training; and comparing the lesion label label_y in the MRI multi-modality training data respectively with the lesion label label_y obtained by the MRI-to-MRI training, the lesion label label_y obtained by the MRI-to-CT training, and the lesion label label_y obtained by the MRI lesion task processing training; if the comparison result of any training does not meet the requirement, jumping to execute step S3) to continue the training; otherwise, ending and exiting.
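The comparisons in step S4) reduce to checking, for each training path, how closely the lesion labels produced on generated or processed images agree with the registered labels label_x and label_y. A minimal sketch of that convergence check follows; the Dice score and the 0.9 threshold are assumptions, since the claim only requires that each comparison meet the requirement:

```python
# Sketch of the S4) convergence check: every training path's predicted
# lesion labels are compared against the registered label_x / label_y.
# The Dice metric and 0.9 threshold are assumptions, not from the patent.
import torch

def dice(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> float:
    intersection = (pred * target).sum()
    return float((2 * intersection + eps) / (pred.sum() + target.sum() + eps))

def all_paths_converged(paths: dict, threshold: float = 0.9) -> bool:
    """paths maps a path name ('ct2ct', 'ct2mri', 'ct_task', 'mri2mri',
    'mri2ct', 'mri_task') to a (predicted_label, registered_label) pair."""
    return all(dice(pred, target) >= threshold for pred, target in paths.values())

# Training loop around step S3); collect_label_pairs and run_training_round
# are hypothetical helpers standing in for the patent's training procedure:
# while not all_paths_converged(collect_label_pairs()):
#     run_training_round()   # repeat S3) until every comparison passes
```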
9. A modular GAN-based multi-modal MRI and multi-modal CT conversion system comprising a computer device, wherein the computer device is programmed or configured to perform the steps of the modular GAN-based multi-modal MRI and multi-modal CT conversion method of any one of claims 1-8, or a storage medium of the computer device has stored thereon a computer program programmed or configured to perform the modular GAN-based multi-modal MRI and multi-modal CT conversion method of any one of claims 1-8.
10. A computer readable storage medium having stored thereon a computer program programmed or configured to perform the modular GAN based multi-modality MRI and multi-modality CT conversion method of any one of claims 1-8.
CN201910880585.3A 2019-09-18 2019-09-18 Conversion method, system and medium of multi-modal MRI and multi-modal CT based on modular GAN Active CN110689561B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910880585.3A CN110689561B (en) 2019-09-18 2019-09-18 Conversion method, system and medium of multi-modal MRI and multi-modal CT based on modular GAN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910880585.3A CN110689561B (en) 2019-09-18 2019-09-18 Conversion method, system and medium of multi-modal MRI and multi-modal CT based on modular GAN

Publications (2)

Publication Number Publication Date
CN110689561A true CN110689561A (en) 2020-01-14
CN110689561B CN110689561B (en) 2022-04-12

Family

ID=69109238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910880585.3A Active CN110689561B (en) 2019-09-18 2019-09-18 Conversion method, system and medium of multi-modal MRI and multi-modal CT based on modular GAN

Country Status (1)

Country Link
CN (1) CN110689561B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108334904A (en) * 2018-02-07 2018-07-27 深圳市唯特视科技有限公司 A kind of multiple domain image conversion techniques based on unified generation confrontation network
CN109409503A (en) * 2018-09-27 2019-03-01 深圳市铱硙医疗科技有限公司 Training method, image conversion method, device, equipment and the medium of neural network
CN110084863A (en) * 2019-04-25 2019-08-02 中山大学 A kind of multiple domain image conversion method and system based on generation confrontation network

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476866A (en) * 2020-04-09 2020-07-31 咪咕文化科技有限公司 Video optimization and playing method and system, electronic equipment and storage medium
CN111476866B (en) * 2020-04-09 2024-03-12 咪咕文化科技有限公司 Video optimization and playing method, system, electronic equipment and storage medium
CN111436936A (en) * 2020-04-29 2020-07-24 浙江大学 CT image reconstruction method based on MRI
CN111862175A (en) * 2020-07-13 2020-10-30 清华大学深圳国际研究生院 Cross-modal medical image registration method and device based on cyclic canonical training
CN112614198A (en) * 2020-11-23 2021-04-06 上海眼控科技股份有限公司 Multi-modal edge entity image conversion method and device, computer equipment and medium
CN112508775A (en) * 2020-12-10 2021-03-16 深圳先进技术研究院 MRI-PET image mode conversion method and system based on loop generation countermeasure network
WO2022120731A1 (en) * 2020-12-10 2022-06-16 深圳先进技术研究院 Mri-pet image modality conversion method and system based on cyclic generative adversarial network

Also Published As

Publication number Publication date
CN110689561B (en) 2022-04-12

Similar Documents

Publication Publication Date Title
CN110689561B (en) Conversion method, system and medium of multi-modal MRI and multi-modal CT based on modular GAN
CN110084863B (en) Multi-domain image conversion method and system based on generation countermeasure network
CN110544239B (en) Multi-modal MRI conversion method, system and medium for generating countermeasure network based on conditions
Yuan et al. An effective CNN and Transformer complementary network for medical image segmentation
Zhou et al. GAN review: Models and medical image fusion applications
Xu et al. Pad-net: Multi-tasks guided prediction-and-distillation network for simultaneous depth estimation and scene parsing
Venkateswara et al. Deep-learning systems for domain adaptation in computer vision: Learning transferable feature representations
CN110544275B (en) Methods, systems, and media for generating registered multi-modality MRI with lesion segmentation tags
CN110675316B (en) Multi-domain image conversion method, system and medium for generating countermeasure network based on condition
Mao et al. Dual-stream network for visual recognition
KR102359474B1 (en) Method for missing image data imputation using neural network and apparatus therefor
Yu et al. Semantic face hallucination: Super-resolving very low-resolution face images with supplementary attributes
Kuga et al. Multi-task learning using multi-modal encoder-decoder networks with shared skip connections
Xu et al. DeMT: Deformable mixer transformer for multi-task learning of dense prediction
Song et al. An extendable, efficient and effective transformer-based object detector
Zhao et al. MPSHT: multiple progressive sampling hybrid model multi-organ segmentation
Khurshid et al. A residual-dyad encoder discriminator network for remote sensing image matching
Lee et al. Visual thinking of neural networks: Interactive text to image synthesis
Katzir et al. Cross-domain cascaded deep translation
WO2022175717A1 (en) System and method for self-attentive image modality conversion and domain adaptation
CN114529564A (en) Lightweight infant brain tissue image segmentation method based on context information
CN115374854A (en) Multi-modal emotion recognition method and device and computer readable storage medium
Zhao et al. Adaptive Dual-Stream Sparse Transformer Network for Salient Object Detection in Optical Remote Sensing Images
CN113780209A (en) Human face attribute editing method based on attention mechanism
Kadri et al. Multimodal deep learning based on the combination of EfficientNetV2 and ViT for Alzheimer’s disease early diagnosis enhanced by SAGAN data augmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventors after: Qu Yili, Su Wanqi, Deng Chufu, Wang Ying, Lu Yutong, Chen Zhiguang
Inventors before: Qu Yili, Su Wanqi, Deng Chufu, Wang Ying, Lu Yutong, Chen Zhiguang, Xiao Nong
GR01 Patent grant