CN116778021B - Medical image generation method, device, electronic equipment and storage medium - Google Patents

Medical image generation method, device, electronic equipment and storage medium

Info

Publication number
CN116778021B
CN116778021B CN202311056376.XA
Authority
CN
China
Prior art keywords
medical image
image
frequency component
sample
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311056376.XA
Other languages
Chinese (zh)
Other versions
CN116778021A (en)
Inventor
罗家佳
王珠辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN202311056376.XA priority Critical patent/CN116778021B/en
Publication of CN116778021A publication Critical patent/CN116778021A/en
Application granted granted Critical
Publication of CN116778021B publication Critical patent/CN116778021B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/003 Reconstruction from projections, e.g. tomography

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a medical image generation method, a medical image generation device, electronic equipment and a storage medium, and relates to the technical field of image processing. The method comprises the following steps: acquiring a medical image corresponding to a current mode and determining a target mode to be generated; and inputting the medical image into a first image generation model corresponding to the current mode and the target mode to obtain a target medical image corresponding to the target mode output by the first image generation model. The loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function. The frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference: the low-frequency component difference is determined based on the difference between a low-frequency component prediction image and the low-frequency component of the first prediction target medical image, and the high-frequency component difference is determined based on the difference between a high-frequency component prediction image and the high-frequency component of the first prediction target medical image. The method and the device can improve the generation accuracy of multi-mode medical images.

Description

Medical image generation method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a medical image generating method, apparatus, electronic device, and storage medium.
Background
With the rapid development of technology, the medical field is becoming increasingly intelligent, and clinical diagnosis can be performed using various weighted medical images such as MRI (Magnetic Resonance Imaging), CT (Computed Tomography) images, and ultrasound images. Medical images of different modes provide different information, so performing clinical diagnosis based on medical images of multiple modes can improve the accuracy and efficiency of clinical diagnosis. However, acquiring medical images of different modes requires multiple repeated examinations of the patient; therefore, medical images of other modes need to be generated based on a medical image of one mode.
Currently, image generation models that can convert modes are obtained by training a CycleGAN (Cycle-Consistent Generative Adversarial Network) model. However, training of the CycleGAN model constrains image generation only through the adversarial loss and the cycle-consistency loss, and both constraints admit multiple feasible solutions, which reduces the generation accuracy of the image generation model and hence the generation accuracy of multi-mode medical images.
Disclosure of Invention
The invention provides a medical image generation method, a medical image generation device, electronic equipment and a storage medium, which address the low generation accuracy of multi-mode medical images in the prior art.
The invention provides a medical image generation method, which comprises the following steps:
acquiring a medical image corresponding to a current mode, and determining a target mode to be generated;
inputting the medical image to a first image generation model corresponding to the current mode and the target mode to obtain a target medical image corresponding to the target mode output by the first image generation model;
the first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode; the first image generation model is obtained by training based on a sample medical image corresponding to the current mode and a sample target medical image corresponding to the target mode; the loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function;
the generator loss function is determined based on a first discrimination result of a first prediction target medical image obtained by inputting the sample medical image into the first image generation model and a second discrimination result of the sample target medical image;
The frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference; the low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image to the first image generation model, and the low-frequency component of the first prediction target medical image; the high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image to the first image generation model, and the high-frequency component of the first prediction target medical image.
According to the medical image generation method provided by the invention, the low-frequency component of the sample medical image is determined based on the following steps:
performing nonlinear processing on the sample medical image to obtain a processed sample medical image;
converting the processed sample medical image into a spectral image;
performing low-pass filtering on the spectrum image to obtain a low-frequency component of the spectrum image;
the low frequency components of the spectral image are converted into low frequency components of the sample medical image.
According to the medical image generation method provided by the invention, the sample medical image is subjected to nonlinear processing to obtain a processed sample medical image, and the medical image generation method comprises the following steps:
inputting the sample medical image into a convolutional neural network model to obtain the processed sample medical image output by the convolutional neural network model;
wherein the convolutional neural network model includes an activation layer constructed based on an activation function.
According to the medical image generation method provided by the invention, the method for converting the low-frequency component of the frequency spectrum image into the low-frequency component of the sample medical image comprises the following steps:
converting the low frequency component of the spectrum image into a low frequency component of a spatial domain image;
a low frequency component of the sample medical image is generated based on an absolute value of the low frequency component of the spatial domain image.
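The steps above (convert the image to a spectral image, low-pass filter it, convert back to the spatial domain, take absolute values) can be sketched with NumPy's FFT routines. This is a minimal illustration only: the circular cutoff radius is an assumed hyperparameter, and the learned nonlinear (CNN-based) preprocessing described above is omitted.

```python
import numpy as np

def frequency_components(image, cutoff_ratio=0.1):
    """Split a 2-D image into low- and high-frequency components.

    cutoff_ratio is an assumed hyperparameter: the radius of the
    circular low-pass region as a fraction of the image diagonal.
    """
    # convert the (optionally preprocessed) image into a spectral image
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # build a circular low-pass mask centred on the zero frequency
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = dist <= cutoff_ratio * np.hypot(h, w)

    # low-pass filter the spectral image, convert back to the spatial
    # domain, and take absolute values, as in the method's final step
    low = np.abs(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))
    high = np.abs(np.fft.ifft2(np.fft.ifftshift(spectrum * ~mask)))
    return low, high
```

A constant image has all of its energy at zero frequency, so its low-frequency component reproduces it while its high-frequency component is numerically zero.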
According to the medical image generation method provided by the invention, the first image generation model is trained based on the following steps:
inputting the sample medical image into the first image generation model to obtain a first prediction target medical image corresponding to the target mode output by the first image generation model;
Inputting the first prediction target medical image to an image discrimination model corresponding to the target mode to obtain the first discrimination result output by the image discrimination model, and inputting the sample target medical image to the image discrimination model to obtain the second discrimination result output by the image discrimination model;
determining a third discrimination result based on a difference value between a preset normalization value and the second discrimination result;
determining the generator loss function based on the third discrimination result and the first discrimination result;
the first image generation model is trained based on the generator loss function and the frequency decomposition loss function.
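Under one plausible reading of the training steps above, the discrimination results are least-squares (LSGAN-style) scores in [0, 1]; the function name and the squared-error form below are illustrative assumptions, not the patent's specification.

```python
import numpy as np

def generator_loss(first_result, second_result, norm_value=1.0):
    """Generator loss from the two discrimination results.

    first_result:  discrimination scores for the first prediction
                   target medical image (the generated image).
    second_result: discrimination scores for the sample target
                   medical image (the real image).
    norm_value:    the preset normalization value (assumed to be 1).
    """
    # third discrimination result: difference between the preset
    # normalization value and the second discrimination result
    third_result = norm_value - second_result

    # combine the third and first results into one scalar loss;
    # the squared-error form is an illustrative assumption
    return float(np.mean((norm_value - first_result) ** 2)
                 + np.mean(third_result ** 2))
```

With a perfect generator and a fooled discriminator (both scores at 1), the loss is zero; the further the generated image's score falls below the normalization value, the larger the loss.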
According to the medical image generation method provided by the invention, the loss function of the first image generation model is determined based on a generator loss function, a frequency decomposition loss function and a consistency loss function;
the consistency loss function is determined based on a first difference between a first predictive medical image and the sample medical image, the first predictive medical image being obtained by inputting the first predictive target medical image into a second image generation model;
The second image generation model is used for converting the image corresponding to the target mode into the image corresponding to the current mode.
According to the medical image generation method provided by the invention, the consistency loss function is determined based on the first difference and the second difference;
the second difference is determined based on a difference of a second predicted target medical image and the sample target medical image;
the second prediction target medical image is obtained by inputting a second prediction medical image into the first image generation model, and the second prediction medical image is obtained by inputting the sample target medical image into the second image generation model.
The present invention also provides a medical image generation apparatus including:
the image acquisition module is used for acquiring a medical image corresponding to the current mode and determining a target mode to be generated;
the image generation module is used for inputting the medical image into a first image generation model corresponding to the current mode and the target mode to obtain a target medical image corresponding to the target mode output by the first image generation model;
the first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode; the first image generation model is obtained by training based on a sample medical image corresponding to the current mode and a sample target medical image corresponding to the target mode; the loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function;
The generator loss function is determined based on a first discrimination result of a first prediction target medical image obtained by inputting the sample medical image into the first image generation model and a second discrimination result of the sample target medical image;
the frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference; the low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image to the first image generation model, and the low-frequency component of the first prediction target medical image; the high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image to the first image generation model, and the high-frequency component of the first prediction target medical image.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing a medical image generation method as described in any of the above when executing the program.
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a medical image generation method as described in any of the above.
The invention provides a medical image generation method, a device, electronic equipment and a storage medium, wherein the loss function of a first image generation model for converting modes is determined based on a generator loss function and a frequency decomposition loss function, and the frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference. The low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of a sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image, so that the first image generation model learns the low-frequency component of the sample medical image while the low-frequency information of the generated image is constrained, improving the generation accuracy of the first image generation model and thus of the multi-mode medical image. The high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image, so that the model likewise learns and constrains the high-frequency information of the generated image, again improving generation accuracy. In this way, the high-frequency and low-frequency information of the image is attended to simultaneously, and therefore so are the identity features and style features of the image; the high-frequency and low-frequency information of the original image is preserved, the identity and style information of the generated image is retained, and the generation accuracy of the multi-mode medical image is improved. In addition, the medical image corresponding to the current mode is input into the first image generation model to obtain the target medical image corresponding to the target mode output by the model, realizing single input and single output without increasing the size of the first image generation model, which further improves the stability and robustness of the first image generation model and ultimately the generation accuracy of the multi-mode medical image.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a medical image generation method according to the present invention;
FIG. 2 is a second flow chart of the medical image generating method according to the present invention;
FIG. 3 is a third flow chart of the medical image generating method according to the present invention;
FIG. 4 is a first schematic diagram of a determination method of a consistency loss function according to the present invention;
FIG. 5 is a second schematic diagram of a determination method of a consistency loss function according to the present invention;
FIG. 6 is a schematic diagram of a determination method of a frequency decomposition loss function according to the present invention;
FIG. 7 is a schematic structural view of a medical image generating apparatus according to the present invention;
fig. 8 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
With the rapid development of technology, the medical field is becoming increasingly intelligent, and clinical diagnosis can be performed using various weighted medical images such as MRI (Magnetic Resonance Imaging), CT (Computed Tomography) images, and ultrasound images. Medical images of different modalities provide different information: for example, MRI can display tissue structures, CT can provide high-resolution images of anatomical structures, and ultrasound images can show real-time blood-flow dynamics. Clinical diagnosis based on medical images of different modalities therefore yields more comprehensive and accurate information, and can improve the accuracy and efficiency of clinical diagnosis. However, acquiring medical images of different modalities requires repeated examinations of the patient, increasing the patient's discomfort and radiation exposure; therefore, medical images of other modalities need to be generated from a medical image of one modality, i.e., multi-modality medical image translation is needed; in other words, medical images of different modalities need to be integrated and interconverted. In addition, in clinical practice, medical staff cannot always acquire medical images of all modalities, owing to objective conditions such as machine failure, differences in individual patient sensitivity, and external environmental interference; this again makes multi-modality medical image translation necessary.
It can be understood that by translating and fusing medical images of different modes, various kinds of information can be synthesized, helping doctors comprehensively understand the patient's condition and formulate more accurate diagnosis and treatment plans. Through multi-modality medical image translation, which integrates and translates different medical images, a physician can obtain the required information from one image set, reducing unnecessary repeated examinations of the patient.
Currently, multi-modal medical image translation is implemented based on deep learning; specifically, an image generation model that can convert modes is obtained by training a CycleGAN (Cycle-Consistent Generative Adversarial Network) model. However, training of the CycleGAN model constrains image generation only through the adversarial loss and the cycle-consistency loss, and both constraints admit multiple feasible solutions, which reduces the generation accuracy of the image generation model and hence the generation accuracy of multi-mode medical images.
In addition, an image generation model capable of converting modes can currently be obtained by training with FDIT (a frequency-based image translation framework). However, this training constrains only the reconstruction loss and the translation loss of the image: high-frequency information is constrained explicitly in the translation branch, while low-frequency information is only weakly constrained through the adversarial loss. As a result, the image generation model over-emphasizes the preservation and reconstruction of high-frequency information while neglecting the translation of low-frequency information. Experiments show that the low-frequency information contains most of the information in an image, and constraining it is more effective than constraining the high-frequency information; over-emphasizing high-frequency preservation and reconstruction on this basis therefore reduces the generation accuracy of the image generation model, and in turn the generation accuracy of multi-mode medical images. Meanwhile, the input of such an image generation model is two tensors (the style tensor of a sample image corresponding to the target mode and the content tensor of a sample image corresponding to the source mode), both obtained by frequency-domain filtering. The stability of this multi-input, single-output image generation model is easily affected by the input parameters, which reduces the generation accuracy of multi-mode medical images; moreover, a multi-input, single-output model requires input data from multiple modes, which reduces the convenience of multi-mode medical image generation.
In view of the above problems, the present invention proposes the following embodiments. Fig. 1 is a schematic flow chart of a medical image generating method according to the present invention, as shown in fig. 1, where the medical image generating method includes:
step 110, acquiring a medical image corresponding to the current mode, and determining a target mode to be generated.
Here, the current modality is the source modality: the modality of the medical image is the current modality, and the medical image is the image to be converted (translated). The current modality may include, but is not limited to: the T1-weighted imaging modality of MRI, the T2-weighted imaging modality of MRI, the FLAIR imaging modality of MRI, a CT modality, an ultrasound modality, and the like. The target modality is the modality to be converted to, i.e., the medical image is converted from the current modality to the target modality; the target modality may likewise include, but is not limited to: a CT modality, an ultrasound modality, the T1-weighted imaging modality of MRI, the T2-weighted imaging modality of MRI, the FLAIR imaging modality of MRI, and the like.
It should be noted that MRI techniques use strong magnetic fields and radio waves to generate images that can be used to view the structure and function of tissue, as well as to detect any abnormalities, i.e., magnetic resonance imaging can be used for diagnosis and analysis of a variety of diseases. MRI can generate detailed three-dimensional images showing the structure and function of internal tissues of the human body. In clinical medicine, MRI has become a very useful diagnostic tool to assist doctors in the accurate diagnosis and treatment of various diseases. MRI can help doctors to accurately diagnose many diseases including stroke, tumors, multiple sclerosis, herniated discs, cartilage damage, epilepsy, and the like. MRI can produce high resolution images showing different tissue structures and functions, which allows doctors to better diagnose and treat diseases.
Furthermore, in magnetic resonance imaging there are a number of modalities based on different weightings, such as T1-weighted, T2-weighted, FLAIR, etc. Each modality emphasizes different information, and for certain diseases a particular modality may provide the necessary key information; therefore, in clinical diagnosis, magnetic resonance images of all modalities should be acquired as far as possible.
Step 120, inputting the medical image to a first image generation model corresponding to the current mode and the target mode, and obtaining a target medical image corresponding to the target mode output by the first image generation model.
The first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode. Different current modalities and target modalities correspond to different image generation models, i.e. the first image generation model is used for translating an image corresponding to the current modality into an image corresponding to the target modality, e.g. the first image generation model is used for translating MRI into a CT image.
The first image generation model is a GAN (Generative Adversarial Network) model. GAN is a popular generative deep learning framework that continuously improves the generated images by pitting a discriminator against a generator. In a specific embodiment, the medical image is input to the encoding layer (Encoder) of the first image generation model to obtain the image features output by the encoding layer, and the image features are input to the decoding layer (Decoder) of the first image generation model to obtain the target medical image output by the decoding layer; further, the encoding layer includes a plurality of convolution layers to extract image features of the medical image at each level. For example, the first image generation model may be built based on a CycleGAN model, a Pix2Pix model, or a UNIT model.
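As a purely structural illustration of the encoder-decoder generator described above, the toy NumPy sketch below runs convolution layers in the encoder (with downsampling) and nearest-neighbour upsampling in the decoder. Every kernel, layer count, and size here is an assumption for illustration, not the patent's architecture.

```python
import numpy as np

def conv2d_same(x, kernel):
    # naive 'same' 2-D convolution with zero padding, illustration only
    kh, kw = kernel.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def generator(image):
    """Toy encoder-decoder: two convolution layers extract image
    features (the encoder), then upsampling restores the spatial
    size (the decoder). Kernel and depths are assumptions."""
    blur = np.full((3, 3), 1 / 9.0)            # assumed encoder kernel
    feat = conv2d_same(image, blur)[::2, ::2]  # convolve + downsample
    feat = conv2d_same(feat, blur)             # second convolution layer
    # decoder: nearest-neighbour upsampling back to the input size
    return np.repeat(np.repeat(feat, 2, axis=0), 2, axis=1)
```

In a real model the hand-fixed kernels would be learned parameters and the decoder would use learned transposed convolutions, but the input and output shapes relate in the same way: the generated image has the same spatial size as the input image.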
The first image generation model is obtained through training based on the sample medical image corresponding to the current mode and the sample target medical image corresponding to the target mode. The mode of the sample medical image is the current mode, and the mode of the sample target medical image is the target mode.
Here, the sample medical image and the sample target medical image are images of the same site, and may be paired or unpaired. For example, if the first image generation model is used to translate MRI into CT images, the sample medical images may be brain MRI scans of 24 brain-tumor patients, and the sample target medical images the corresponding brain CT images of the same 24 patients, with the samples of 18 patients used as the training set and the samples of 6 patients as the test set; further, the sample medical images may be 3-dimensional T1-weighted images.
Wherein the loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function.
In an embodiment, the loss function of the first image generation model is determined based on a sum of the generator loss function and the frequency decomposition loss function. For example, the loss function of the first image generation model is as follows:
$L = L_{G} + L_{freq}$

where $L$ is the loss function of the first image generation model, $L_{G}$ is the generator loss function, and $L_{freq}$ is the frequency decomposition loss function.
In another embodiment, the loss function of the first image generation model is determined based on a weighted aggregation of the generator loss function and the frequency decomposition loss function. For example, the loss function of the first image generation model is as follows:
$L = \lambda_{G} L_{G} + \lambda_{freq} L_{freq}$

where $L$ is the loss function of the first image generation model, $L_{G}$ is the generator loss function, $L_{freq}$ is the frequency decomposition loss function, $\lambda_{G}$ is the weight corresponding to the generator loss function, and $\lambda_{freq}$ is the weight corresponding to the frequency decomposition loss function.
Wherein the generator loss function is determined based on a first discrimination result of a first prediction target medical image obtained by inputting the sample medical image to the first image generation model and a second discrimination result of the sample target medical image.
Specifically, a first prediction target medical image is input to an image discrimination model corresponding to a target mode to obtain a first discrimination result output by the image discrimination model, and a sample target medical image is input to the image discrimination model to obtain a second discrimination result output by the image discrimination model. Inputting the first predicted target medical image to a discriminator corresponding to the target mode to obtain a first discrimination result corresponding to the target mode; and inputting the sample target medical image to a discriminator corresponding to the target mode to obtain a second discrimination result corresponding to the target mode. Further, generator loss functions may also be used to train the image discrimination model.
Wherein the frequency decomposition loss function is determined based on a low-frequency component difference determined based on a difference of a low-frequency component prediction image and a low-frequency component of the first prediction target medical image, and a high-frequency component difference determined based on a difference of a high-frequency component prediction image and a high-frequency component of the first prediction target medical image, the low-frequency component prediction image being obtained by inputting the low-frequency component of the sample medical image to the first image generation model, the high-frequency component prediction image being obtained by inputting the high-frequency component of the sample medical image to the first image generation model.
In some embodiments, the frequency decomposition loss function is determined based on a sum of the low frequency component difference and the high frequency component difference. In other embodiments, the frequency decomposition loss function is determined based on a weighted aggregation of low frequency component differences and high frequency component differences.
Here, the difference of the low frequency component predicted image and the low frequency component of the first prediction target medical image may be determined by an L1 norm, an L2 norm, or the like. The low frequency component of the first prediction target medical image is an image of the spatial domain.
Here, the difference of the high frequency component predicted image and the high frequency component of the first prediction target medical image may be determined by an L1 norm, an L2 norm, or the like. The high frequency component of the first prediction target medical image is an image of the spatial domain.
Specifically, inputting a low-frequency component of a sample medical image into a first image generation model to obtain a low-frequency component prediction image output by the first image generation model; and inputting the high-frequency component of the sample medical image into the first image generation model to obtain a high-frequency component prediction image output by the first image generation model.
Illustratively, the frequency decomposition loss function is as follows:
$$\mathcal{L}_{fd} = \big\| G(x_L) - G(x)_L \big\|_1 + \big\| G(x_H) - G(x)_H \big\|_1$$

where $\mathcal{L}_{fd}$ is the frequency decomposition loss function; $x$ is the sample medical image, an image of the spatial domain; $x_L$ is the low-frequency component of the sample medical image and $G(x_L)$ is the low-frequency component prediction image; $G(x)$ is the first prediction target medical image and $G(x)_L$ is its low-frequency component; $\| G(x_L) - G(x)_L \|_1$ is the low-frequency component difference; $x_H$ is the high-frequency component of the sample medical image, $G(x_H)$ is the high-frequency component prediction image, $G(x)_H$ is the high-frequency component of the first prediction target medical image, and $\| G(x_H) - G(x)_H \|_1$ is the high-frequency component difference.
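The frequency decomposition loss can be sketched as a small NumPy function. The argument names (`g_xl` for the low-frequency component prediction image, `gx_l` for the low-frequency component of the first prediction target image, and so on) are illustrative, and the mean-reduced L1 norm is an assumption about how the differences are aggregated:

```python
import numpy as np

def frequency_decomposition_loss(g_xl, gx_l, g_xh, gx_h):
    """Sketch of L_fd = ||G(x_L) - G(x)_L||_1 + ||G(x_H) - G(x)_H||_1.

    g_xl: low-frequency component prediction image G(x_L)
    gx_l: low-frequency component of the first prediction target image G(x)_L
    g_xh, gx_h: the corresponding high-frequency quantities
    """
    low_diff = np.abs(g_xl - gx_l).mean()    # low-frequency component difference
    high_diff = np.abs(g_xh - gx_h).mean()   # high-frequency component difference
    return low_diff + high_diff
```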
According to the medical image generation method provided by the embodiment of the invention, the loss function of the first image generation model for the conversion modality is determined based on the generator loss function and the frequency decomposition loss function, and the frequency decomposition loss function is determined based on the low-frequency component difference and the high-frequency component difference. The low-frequency component difference is determined based on the difference between the low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image; this makes the first image generation model learn the low-frequency component of the sample medical image while constraining the low-frequency information of the generated image, thereby improving the generation accuracy of the first image generation model and, in turn, of the multi-modality medical image. Analogously, the high-frequency component difference is determined based on the difference between the high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image; this makes the first image generation model learn the high-frequency component of the sample medical image while constraining the high-frequency information of the generated image. In this way, the model attends to the high-frequency and low-frequency information of the image simultaneously, and thus to its identity features and style features: the high-frequency and low-frequency information of the original image is preserved, the identity information and style information of the generated image are retained, and the generation accuracy of the multi-modality medical image is improved. In addition, the medical image corresponding to the current modality is input into the first image generation model to obtain the target medical image corresponding to the target modality output by the first image generation model, so that single input and single output of the model are realized without increasing the size of the first image generation model; this further improves the stability and robustness of the first image generation model and ultimately the generation accuracy of the multi-modality medical image.
Based on the above embodiment, consider that a frequency decomposition loss function determined from the low-frequency and high-frequency component differences alone would force the first image generation model to make the generated image equal to the original image in its high-frequency and low-frequency components. This constraint is unreasonable and not applicable to medical image conversion: the first image generation model is a nonlinear processing model, so the linear components of the mapped, generated image generally change, and the high-frequency and low-frequency components of medical images of different modalities of the same organ do not satisfy such a constraint. In other words, because the first image generation model contains operators such as nonlinear activation functions, which satisfy neither homogeneity nor superposition, the constraint is biased. On this basis, fig. 2 is a second schematic flow chart of the medical image generation method provided by the present invention; as shown in fig. 2, the low-frequency component of the sample medical image is determined based on the following steps:
and 210, performing nonlinear processing on the sample medical image to obtain a processed sample medical image.
Here, the non-linear processing may include, but is not limited to: nonlinear neural network processing, activation function processing, and the like. The sample medical image and the processed sample medical image are both spatial domain images.
In one embodiment, a sample medical image is input to a nonlinear neural network model to obtain a processed sample medical image output by the nonlinear neural network model.
Step 220, converting the processed sample medical image into a spectrum image.
Specifically, the processed sample medical image is converted into a spectral image belonging to the frequency domain.
In a specific embodiment, the processed sample medical image is subjected to a Fourier transform and mapped to the frequency domain to obtain the spectrum image. It should be noted that the Fourier transform is linear and additive, that is, the original image equals the algebraic sum of its high-frequency component and its low-frequency component; on this basis, the sample medical image is subjected to nonlinear processing before the Fourier transform, so as to solve the problem of insufficient nonlinearity.
Illustratively, for an H×W sample medical image $x$, the discrete Fourier transform is performed with the following transform formula:

$$F(u,v) = \sum_{h=0}^{H-1}\sum_{w=0}^{W-1} N(x)(h,w)\, e^{-j2\pi\left(\frac{uh}{H} + \frac{vw}{W}\right)}$$

where $F(u,v)$ is the spectral information of the spectrum image at coordinates $(u,v)$, and $N(\cdot)$ represents the nonlinear processing.
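As a sanity check, the transform above coincides with NumPy's `fft2`. The direct four-index evaluation below is for illustration only, and assumes the nonlinear processing has already been applied to `x`:

```python
import numpy as np

def dft2_manual(x):
    """Directly evaluate F(u,v) = sum_h sum_w x(h,w) exp(-j*2pi*(uh/H + vw/W))."""
    H, W = x.shape
    u = np.arange(H)[:, None, None, None]
    v = np.arange(W)[None, :, None, None]
    h = np.arange(H)[None, None, :, None]
    w = np.arange(W)[None, None, None, :]
    kernel = np.exp(-2j * np.pi * (u * h / H + v * w / W))
    # broadcast to an (H, W, H, W) tensor, then sum over the spatial indices
    return (kernel * x[None, None, :, :]).sum(axis=(2, 3))
```

For an H×W image this direct evaluation costs O(H²W²), whereas `np.fft.fft2` computes the same result in O(HW log HW).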
Step 230, performing low-pass filtering on the spectrum image to obtain a low-frequency component of the spectrum image.
In one embodiment, the spectral image is low-pass Gaussian filtered in the frequency domain to obtain the low-frequency component of the spectral image. Illustratively, the low-pass Gaussian filter is as follows:
$$\mathcal{G}_{lp}(u,v) = \exp\!\left(-\frac{D^2(u,v)}{2\sigma^2}\right)$$

where $\mathcal{G}_{lp}$ is the low-pass Gaussian filter, $\sigma$ is the standard deviation of the filter, $(u,v)$ are the coordinates of the spectrum image, and $D(u,v)$ denotes the distance from $(u,v)$ to the center of the spectrum.
Illustratively, the low-pass Gaussian filtering is as follows:
$$F_L(u,v) = \mathcal{G}_{lp}(u,v)\,F(u,v)$$

where $F_L(u,v)$ is the spectral information of the low-frequency component of the spectrum image at coordinates $(u,v)$, $F(u,v)$ is the spectral information of the spectrum image at coordinates $(u,v)$, and $\mathcal{G}_{lp}(u,v)$ is the low-pass Gaussian filter at coordinates $(u,v)$.
Step 240, converting the low frequency component of the spectrum image into the low frequency component of the sample medical image.
In particular, the low frequency components of the spectral image are converted into low frequency components of a sample medical image belonging to the spatial domain.
In a specific embodiment, the low frequency component of the spectrum image is subjected to an inverse Fourier transform and mapped to the spatial domain, resulting in the low frequency component of the sample medical image.
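Steps 210 to 240 can be sketched end to end in NumPy. The `nonlinearity` argument stands in for the learned nonlinear processing N(·) of the patent (a fixed `tanh` is used here purely as a placeholder, not the actual model), and `sigma` is the standard deviation of the low-pass Gaussian filter:

```python
import numpy as np

def low_frequency_component(image, sigma=10.0, nonlinearity=np.tanh):
    x = nonlinearity(image)                      # step 210: nonlinear processing
    spectrum = np.fft.fftshift(np.fft.fft2(x))   # step 220: to the frequency domain
    h, w = spectrum.shape
    rows, cols = np.mgrid[0:h, 0:w]
    d2 = (rows - h // 2) ** 2 + (cols - w // 2) ** 2  # squared distance to centre
    g_lp = np.exp(-d2 / (2.0 * sigma ** 2))      # step 230: low-pass Gaussian filter
    low_spec = g_lp * spectrum
    # step 240: back to the spatial domain; the absolute value removes
    # spurious negative/imaginary residues of the inverse transform
    return np.abs(np.fft.ifft2(np.fft.ifftshift(low_spec)))
```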
According to the medical image generation method provided by the embodiment of the invention, aiming at the problem of insufficient nonlinearity, nonlinear processing is carried out on a sample medical image before the sample medical image is converted into a frequency spectrum image, so that the problem of insufficient nonlinearity is solved, the generation accuracy of a first image generation model is further improved, and the generation accuracy of a multi-mode medical image is further improved.
Based on any one of the above embodiments, the method in step 210 includes:
Inputting the sample medical image into a convolutional neural network model to obtain the processed sample medical image output by the convolutional neural network model;
wherein the convolutional neural network model includes an activation layer constructed based on an activation function.
Here, the activation function may include, but is not limited to: a ReLU function and a Sigmoid function, etc.
In one embodiment, the convolutional neural network model comprises a plurality of convolutional neural network layers, and each convolutional neural network layer comprises a convolutional layer, a pooling layer and an activation layer which are sequentially connected. Preferably, the convolutional neural network model comprises three convolutional neural network layers.
Illustratively, the processed sample medical image is as follows:
$$x'(h,w) = N\big(x(h,w)\big)$$

where $x(h,w)$ represents the image information of the sample medical image at coordinates $(h,w)$, $x'$ is the processed sample medical image, and $N(\cdot)$ represents the nonlinear processing of the convolutional neural network model.
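A minimal sketch of the nonlinear processing N(·) as stacked convolution-plus-activation layers, in plain NumPy. The fixed kernels are hypothetical and the pooling layer is omitted for brevity; in the patent's embodiment, N(·) is a trained three-layer convolutional neural network:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv2d(x, kernel):
    """'Valid' 2-D cross-correlation (commonly called convolution in deep learning)."""
    kh, kw = kernel.shape
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw))
    return np.einsum('ijkl,kl->ij', windows, kernel)

def cnn_nonlinear(x, kernels):
    """Hypothetical N(.): one convolution plus ReLU activation per layer."""
    for k in kernels:
        x = relu(conv2d(x, k))
    return x
```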
According to the medical image generation method provided by the embodiment of the invention, aiming at the problem of insufficient nonlinearity, before the sample medical image is converted into the frequency spectrum image, the sample medical image is subjected to nonlinear processing by using the activation function, so that the problem of insufficient nonlinearity is solved, the generation accuracy of the first image generation model is further improved, and the generation accuracy of the multi-mode medical image is further improved.
Based on any one of the above embodiments, step 240 of the method further includes:
converting the low frequency component of the spectrum image into a low frequency component of a spatial domain image;
a low frequency component of the sample medical image is generated based on an absolute value of the low frequency component of the spatial domain image.
Specifically, the low frequency component of the sample medical image may be determined directly based on the absolute value, or the absolute value may be further processed to obtain the low frequency component of the sample medical image.
In a specific embodiment, the low frequency component of the spectrum image is subjected to inverse fourier transform, and mapped to the spatial domain, so as to obtain the low frequency component of the spatial domain image.
Illustratively, the determination formula for the low frequency component of the sample medical image is as follows:
$$x_L(h,w) = \big|\,\mathcal{F}^{-1}(F_L)(h,w)\,\big|$$

where $x_L$ is the low-frequency component of the sample medical image, an H×W image; $x_L(h,w)$ is its image information at coordinates $(h,w)$; $F_L(u,v)$ is the spectral information of the low-frequency component of the spectrum image at coordinates $(u,v)$; and $\mathcal{F}^{-1}$ denotes the inverse Fourier transform.
According to the medical image generation method provided by the embodiment of the invention, the low-frequency component of the sample medical image is generated based on the absolute value of the low-frequency component of the spatial domain image, so that abnormal pixel values caused by numerical errors are eliminated, further improving the generation accuracy of the multi-modality medical image. For example, a negative pixel value in the low-frequency component of the spatial domain image is converted into a positive number rather than being zeroed out. Meanwhile, the absolute value itself is a nonlinear operation, which also helps solve the problem of insufficient nonlinearity and thus further improves the generation accuracy of the multi-modality medical image.
Based on any of the above embodiments, in view of the above problem of non-linearity deficiency, the high frequency component of the sample medical image is determined based on the following steps:
performing nonlinear processing on the sample medical image to obtain a processed sample medical image; converting the processed sample medical image into a spectral image; high-pass filtering is carried out on the spectrum image to obtain a high-frequency component of the spectrum image; the high frequency components of the spectral image are converted into high frequency components of the sample medical image.
It should be noted that, the specific determination process of the high frequency component of the sample medical image may refer to the determination process of the low frequency component of the sample medical image, which is not described herein.
In one embodiment, the spectral image is high-pass Gaussian filtered in the frequency domain to obtain the high-frequency component of the spectral image. Illustratively, the high pass Gaussian filter is as follows:
$$\mathcal{G}_{hp}(u,v) = 1 - \mathcal{G}_{lp}(u,v)$$

where $\mathcal{G}_{hp}$ is the high-pass Gaussian filter and $\mathcal{G}_{lp}$ is the low-pass Gaussian filter.
Illustratively, the high-pass Gaussian filtering is as follows:
$$F_H(u,v) = \mathcal{G}_{hp}(u,v)\,F(u,v)$$

where $F_H(u,v)$ is the spectral information of the high-frequency component of the spectrum image at coordinates $(u,v)$, $F(u,v)$ is the spectral information of the spectrum image at coordinates $(u,v)$, and $\mathcal{G}_{hp}(u,v)$ is the high-pass Gaussian filter at coordinates $(u,v)$.
In an embodiment, the high frequency components of the spectral image are converted into high frequency components of the spatial domain image; the high frequency component of the sample medical image is generated based on the absolute value of the high frequency component of the spatial domain image.
In a specific embodiment, the high frequency component of the spectrum image is subjected to an inverse Fourier transform and mapped to the spatial domain, so as to obtain the high frequency component of the spatial domain image.
Illustratively, the determination formula for the high frequency component of the sample medical image is as follows:

$$x_H(h,w) = \big|\,\mathcal{F}^{-1}(F_H)(h,w)\,\big|$$

where $x_H$ is the high-frequency component of the sample medical image, an H×W image; $x_H(h,w)$ is its image information at coordinates $(h,w)$; and $F_H(u,v)$ is the spectral information of the high-frequency component of the spectrum image at coordinates $(u,v)$.
It can be understood that generating the high-frequency component of the sample medical image based on the absolute value of the high-frequency component of the spatial domain image removes abnormal pixel values caused by numerical errors, further improving the generation accuracy of the multi-modality medical image. For example, a negative pixel value in the high-frequency component of the spatial domain image is converted into a positive number rather than being zeroed out. Meanwhile, the absolute value is a nonlinear operation, which also helps solve the problem of insufficient nonlinearity and thus further improves the generation accuracy of the multi-modality medical image.
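The high-frequency counterpart mirrors the low-frequency pipeline, replacing the low-pass Gaussian filter by one minus it. As before, `tanh` is only a placeholder for the learned nonlinear processing N(·):

```python
import numpy as np

def high_frequency_component(image, sigma=10.0, nonlinearity=np.tanh):
    x = nonlinearity(image)
    spectrum = np.fft.fftshift(np.fft.fft2(x))
    h, w = spectrum.shape
    rows, cols = np.mgrid[0:h, 0:w]
    d2 = (rows - h // 2) ** 2 + (cols - w // 2) ** 2
    g_hp = 1.0 - np.exp(-d2 / (2.0 * sigma ** 2))   # high-pass = 1 - low-pass
    # inverse transform, then absolute value against numerical artefacts
    return np.abs(np.fft.ifft2(np.fft.ifftshift(g_hp * spectrum)))
```

A constant image carries only a DC component, so its high-frequency component is numerically zero.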
Based on any of the above embodiments, in view of the above problem of non-linearity deficiency, the high frequency component of the first prediction target medical image is determined based on the steps of:
performing nonlinear processing on the first prediction target medical image to obtain a processed first prediction target medical image; converting the processed first prediction target medical image into a spectrum image; high-pass filtering is carried out on the spectrum image to obtain a high-frequency component of the spectrum image; the high frequency components of the spectral image are converted into high frequency components of the first prediction target medical image.
It should be noted that, the specific determination process of the high frequency component of the first prediction target medical image may refer to the determination process of the high frequency component of the sample medical image, which is not described herein.
Based on any of the above embodiments, in view of the above problem of non-linearity deficiency, the low frequency component of the first prediction target medical image is determined based on the steps of:
performing nonlinear processing on the first prediction target medical image to obtain a processed first prediction target medical image; converting the processed first prediction target medical image into a spectrum image; performing low-pass filtering on the spectrum image to obtain a low-frequency component of the spectrum image; the low frequency components of the spectral image are converted into low frequency components of the first prediction target medical image.
It should be noted that, the specific determination process of the low frequency component of the first prediction target medical image may refer to the determination process of the low frequency component of the sample medical image, which is not described herein.
Based on any of the above embodiments, fig. 3 is a third flow chart of the medical image generating method according to the present invention, as shown in fig. 3, where the first image generating model is trained based on the following steps:
step 310, inputting the sample medical image to the first image generation model, and obtaining a first prediction target medical image corresponding to the target mode output by the first image generation model.
Step 320, inputting the first predicted target medical image to an image discrimination model corresponding to the target mode, obtaining the first discrimination result output by the image discrimination model, and inputting the sample target medical image to the image discrimination model, obtaining the second discrimination result output by the image discrimination model.
It will be appreciated that the first image generation model and the image discrimination model are optimized by the continual antagonism of the image discrimination model (discriminator) with the first image generation model (generator).
Step 330, determining a third discrimination result based on the difference between the preset normalization value and the second discrimination result.
Here, the preset normalization value may be set according to actual needs, and preferably, the preset normalization value is 1.
For example, the determination formula of the third discrimination result is as follows:
$$d_3 = 1 - D_Y(y)$$

where $d_3$ is the third discrimination result, the preset normalization value is 1, $y$ is the sample target medical image, and $D_Y(y)$ is the second discrimination result.
Step 340, determining the generator loss function based on the third discrimination result and the first discrimination result.
In some embodiments, the generator loss function is determined based on a sum of squares of the third discrimination result and the first discrimination result. The sum of squares can be directly used as a generator loss function; further data processing may also be performed on the sum of squares to obtain the generator loss function. Further, the sum of squares may also be a weighted sum of squares.
Illustratively, the generator loss function is as follows:
$$\mathcal{L}_{G} = \big(1 - D_Y(y)\big)^2 + \big(D_Y(G(x))\big)^2$$

where $\mathcal{L}_{G}$ is the generator loss function of the first image generation model, the preset normalization value is 1, $y$ is the sample target medical image, $D_Y(y)$ is the second discrimination result, $x$ is the sample medical image, $G(x)$ is the first prediction target medical image, and $D_Y(G(x))$ is the first discrimination result.
In other embodiments, the sum of the third and first discrimination results may be directly used as the generator loss function; further data processing of the sum may also be performed to obtain a generator loss function.
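Under the sum-of-squares formulation above, the generator loss can be sketched as follows; the batch-mean reduction is an assumption:

```python
import numpy as np

def generator_loss(d_real, d_fake, target=1.0):
    """(target - D_Y(y))^2 + D_Y(G(x))^2, averaged over a batch.

    d_real: second discrimination result D_Y(y) on real target images
    d_fake: first discrimination result D_Y(G(x)) on generated images
    target: the preset normalization value (1 by default)
    """
    return np.mean((target - d_real) ** 2) + np.mean(d_fake ** 2)
```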
Step 350, training the first image generation model based on the generator loss function and the frequency decomposition loss function.
According to the medical image generation method provided by the embodiment of the invention, the generator loss function is determined in the mode, so that support is provided for training of the first image generation model, and generation of the multi-mode medical image is realized.
Based on any of the above embodiments, the loss function of the first image generation model is determined based on a generator loss function, a frequency decomposition loss function, and a consistency loss function. Further, the consistency loss function can also be used for optimizing an image discrimination model corresponding to the target modality.
Wherein the consistency loss function is determined based on a first difference of a first predictive medical image and the sample medical image, the first predictive medical image being obtained by inputting the first predictive target medical image into a second image generation model. The first difference may be determined by an L1 norm, an L2 norm, or the like.
Specifically, the first difference may be directly determined as the consistency loss function, or further data processing may be performed on the first difference to obtain the consistency loss function.
Illustratively, the consistency loss function is as follows:
$$\mathcal{L}_{cyc} = \big\| F(G(x)) - x \big\|_1$$

where $\mathcal{L}_{cyc}$ is the consistency loss function, $x$ is the sample medical image, $G(x)$ is the first prediction target medical image, and $F(G(x))$ is the first predictive medical image. Based on this first difference, a voxel-level L1 loss between the first predictive medical image and the sample medical image can be calculated.
Specifically, the first prediction target medical image is input to the second image generation model, and the first prediction medical image output by the second image generation model is obtained.
The second image generation model is used for converting an image corresponding to the target modality into an image corresponding to the current modality. Different pairs of target and current modalities correspond to different image generation models; that is, the second image generation model translates an image of the target modality into an image of the current modality, for example, converting a CT image into an MRI image.
The second image generation model is a GAN model. In a specific embodiment, the first prediction target medical image is input to an encoding layer (Encoder layer) of the second image generation model to obtain image features output by the encoding layer, and the image features are input to a decoding layer (Decoder layer) of the second image generation model to obtain the first prediction medical image output by the decoding layer. For example, the second image generation model may be built based on a CycleGAN model or a Pix2Pix model or a UNIT model.
The second image generation model is trained based on a sample target medical image corresponding to the target mode and a sample medical image corresponding to the current mode. The training process of the second image generation model is basically the same as that of the first image generation model, and will not be described in detail here.
Illustratively, the generator loss function of the second image generation model is as follows:
$$\mathcal{L}_{F} = \big(1 - D_X(x)\big)^2 + \big(D_X(F(y))\big)^2$$

where $\mathcal{L}_{F}$ is the generator loss function of the second image generation model, the preset normalization value is 1, $x$ is the sample medical image, $D_X(x)$ is the discrimination result of the sample medical image, $y$ is the sample target medical image, $F(y)$ is the second predictive medical image, and $D_X(F(y))$ is the discrimination result of the second predictive medical image.
According to the medical image generation method provided by the embodiment of the invention, the loss function of the first image generation model for the conversion mode is determined based on the generator loss function, the frequency decomposition loss function and the consistency loss function, and the consistency loss function is determined based on the first difference between the first predicted medical image and the sample medical image, so that the generation accuracy of the first image generation model is further improved, and the generation accuracy of the multi-mode medical image is further improved.
Based on any of the above embodiments, consider that the sample medical image and the sample target medical image may be unpaired images, and that unpaired images may not yield the desired distribution of the sample medical image and the sample target medical image, so training may not obtain a correct gradient. For this reason, the consistency loss function is determined based on the first difference and a second difference, where the second difference is determined based on the difference between a second predicted target medical image and the sample target medical image.
The second prediction target medical image is obtained by inputting a second prediction medical image into the first image generation model, and the second prediction medical image is obtained by inputting the sample target medical image into the second image generation model.
The difference of the second predicted target medical image from the sample target medical image may be determined by an L1 norm, an L2 norm, or the like.
Specifically, the difference between the second predicted target medical image and the sample target medical image may be directly determined as the second difference, or further data processing may be performed on the difference between the second predicted target medical image and the sample target medical image to obtain the second difference.
In some embodiments, the consistency loss function is determined based on a sum of the first difference and the second difference.
Illustratively, the consistency loss function is as follows:
$$\mathcal{L}_{cyc} = \big\| F(G(x)) - x \big\|_1 + \big\| G(F(y)) - y \big\|_1$$

where $\mathcal{L}_{cyc}$ is the consistency loss function; $x$ is the sample medical image, $G(x)$ is the first prediction target medical image, and $F(G(x))$ is the first predictive medical image; $y$ is the sample target medical image, $F(y)$ is the second predictive medical image, and $G(F(y))$ is the second predicted target medical image. Based on the first difference, a voxel-level L1 loss between the first predictive medical image and the sample medical image can be calculated; based on the second difference, a voxel-level L1 loss between the second predicted target medical image and the sample target medical image can be calculated.
In other embodiments, the consistency loss function is determined based on a weighted aggregation of the first difference and the second difference.
Specifically, inputting a sample target medical image into a second image generation model to obtain a second predicted medical image output by the second image generation model; and inputting the second predictive medical image into the first image generation model to obtain a second predictive target medical image output by the first image generation model.
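With the reconstructions F(G(x)) and G(F(y)) already computed, the bidirectional consistency loss reduces to two voxel-level L1 terms. A minimal sketch, with mean-reduced L1 as an assumption:

```python
import numpy as np

def consistency_loss(x, x_rec, y, y_rec):
    """||F(G(x)) - x||_1 + ||G(F(y)) - y||_1 at the voxel level.

    x_rec: first predictive medical image F(G(x))
    y_rec: second predicted target medical image G(F(y))
    """
    first_difference = np.abs(x_rec - x).mean()
    second_difference = np.abs(y_rec - y).mean()
    return first_difference + second_difference
```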
In some embodiments, to more accurately determine the first difference and the second difference, the sample medical image and the sample target medical image of the same patient are rigidly registered to align the sample medical image and the sample target medical image. Further, resampling is carried out on the registered sample target medical image to obtain a sample target medical image with the same voxel size as the sample medical image, so that the training effect of the first image generation model is improved, and the generation accuracy of the multi-mode medical image is further improved.
In one embodiment, the method of rigid registration may be a mutual information method.
More specifically, referring to fig. 4 and 5, the consistency loss function is determined based on the following steps:
inputting the sample medical image into the first image generation model to obtain a first prediction target medical image corresponding to the target mode output by the first image generation model;
inputting the first prediction target medical image into the second image generation model to obtain a first prediction medical image corresponding to the current mode output by the second image generation model;
inputting the sample target medical image into the second image generation model to obtain a second predicted medical image corresponding to the current mode output by the second image generation model;
inputting the second predictive medical image into the first image generation model to obtain a second predictive target medical image corresponding to a target mode output by the first image generation model;
the consistency loss function is determined based on a difference of the first predictive medical image and the sample medical image, and a difference of the second predictive target medical image and the sample target medical image.
It can be understood that the determination of the consistency loss function is implemented by a bidirectional loop structure (bidirectional loop GAN structure), i.e. one loop is divided into a forward process (refer to fig. 4) and a reverse process (refer to fig. 5), where the forward process implements the generation of the current mode to the target mode, and the reverse process implements the generation of the target mode to the current mode.
The first image generation model trained with the consistency loss function achieves better performance. Specifically, the first image generation model was evaluated using mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and Fréchet inception distance (FID), and it outperforms the prior art on all of these metrics.
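Two of the cited evaluation metrics have simple closed forms; minimal NumPy definitions follow (the `data_range` default of 1.0 assumes intensity-normalized images):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images; lower is better."""
    return float(np.mean((a - b) ** 2))

def psnr(a, b, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    m = mse(a, b)
    return float('inf') if m == 0 else 10.0 * np.log10(data_range ** 2 / m)
```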
According to the medical image generation method provided by the embodiment of the invention, the generation of the unpaired medical image is realized, the consistency loss function is determined based on the first difference and the second difference, and the bidirectional circulation structure is realized, so that the generation accuracy of the first image generation model is further improved, and the generation accuracy of the multi-mode medical image is further improved.
Further, referring to fig. 6, the frequency decomposition loss function is determined based on the following steps:
Inputting the sample medical image into the first image generation model to obtain a first prediction target medical image output by the first image generation model;
inputting the low-frequency component of the sample medical image into the first image generation model to obtain a low-frequency component prediction image output by the first image generation model;
inputting the high-frequency component of the sample medical image into the first image generation model to obtain a high-frequency component prediction image output by the first image generation model;
determining a low-frequency component difference based on the difference between the low-frequency component prediction image and the low-frequency component of the first prediction target medical image;
determining a high-frequency component difference based on the difference between the high-frequency component prediction image and the high-frequency component of the first prediction target medical image;
the frequency decomposition loss function is determined based on the low frequency component difference and the high frequency component difference.
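The steps above can be sketched as follows. This is an illustrative sketch only: the FFT-based circular low-pass mask, its radius, and the use of L1 differences are assumptions for the example and are not taken from the embodiment, which extracts the low-frequency component with the nonlinear-processing steps described elsewhere.

```python
import numpy as np

def frequency_split(img, radius):
    """Split an image into low- and high-frequency components via a circular
    FFT low-pass mask (an assumed, simplified decomposition)."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
    high = img - low  # the residual carries the high-frequency detail
    return low, high

def frequency_decomposition_loss(G, sample, radius=8):
    """Frequency decomposition loss following the steps above (L1 differences).

    G: the first image generation model, taken here as any image-to-image callable.
    """
    pred = G(sample)  # first prediction target medical image
    low, high = frequency_split(sample, radius)
    pred_low, pred_high = frequency_split(pred, radius)
    # Low-frequency component prediction image vs. low-frequency component of the prediction.
    low_diff = np.mean(np.abs(G(low) - pred_low))
    # High-frequency component prediction image vs. high-frequency component of the prediction.
    high_diff = np.mean(np.abs(G(high) - pred_high))
    return low_diff + high_diff
```

Note that the two components reconstruct the input exactly (`low + high == img`), so the decomposition constrains the full frequency domain.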
Based on the above embodiments, the present invention proposes a single-input single-output network (the first image generation model) based on GANs and a frequency-domain decomposition loss, which realizes a strongly robust full-frequency-domain constraint for both unpaired and paired medical images.
In order to facilitate understanding of the technical effects of the present invention, experimental data are described below. The experiment was trained and validated using a random subset of brain multi-modality MRI data (e.g., BraTS 2021), with all sample images of the same size. The two selected MRI modes are the T2-weighted mode and the Flair mode, and CycleGAN, Pix2Pix, and UNIT are respectively used as baseline models. The present invention mainly carries out comparison experiments, and the experimental results are shown in the following table:
In the table, CycleGAN+Ours, UNIT+Ours, Pix2Pix+Ours, and Pix2Pix+FDIT+Ours are schemes of the present invention. "T2 to Flair" indicates that the current mode is the T2-weighted mode and the target mode is the Flair mode; "Flair to T2" indicates that the current mode is the Flair mode and the target mode is the T2-weighted mode. MSE denotes the mean squared error; FID (Frechet Inception Distance) is determined based on the Frechet distance between Gaussian distributions. In the Pix2Pix+FDIT+Ours experiments, paired data are used, so the high-frequency constraint uses the spatial-domain high-frequency constraint of FDIT, while the low-frequency constraint uses the low-frequency constraint of the present invention. In the CycleGAN+FDIT experiments, mode collapse occurs in the Flair-to-T2 direction due to the instability and systematic bias of the FDIT constraint. Because the baseline models already include a reconstruction constraint (a spatial-domain constraint), and the FDIT paper shows that constraining only in the spatial domain outperforms constraining only in the frequency domain and is very close in effect to a combined spatial-plus-frequency-domain constraint, while the frequency-domain constraint additionally causes training instability and consumes substantial computational resources, the FDIT variant used in the present invention retains only the spatial-domain constraint of the transition part. It should be noted that higher SSIM and PSNR indicate better performance, while lower MSE and FID indicate better performance.
It can be seen from the above table that the performance indexes of the scheme of the present invention are superior.
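For reference, MSE and PSNR can be computed directly from two images as below; this is a generic sketch, not the evaluation code of the embodiment. SSIM and FID require more machinery (SSIM is typically computed with scikit-image's `structural_similarity`, and FID with features from a pretrained Inception network), so only the two simple metrics are shown.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images; lower is better."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return np.mean((a - b) ** 2)

def psnr(a, b, data_range=255.0):
    """Peak signal-to-noise ratio in dB; higher is better.

    data_range is the dynamic range of the images (assumed 8-bit here).
    """
    err = mse(a, b)
    if err == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / err)
```

For example, two identical images yield infinite PSNR, and PSNR decreases as the squared error grows.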
The medical image generating apparatus provided by the present invention will be described below, and the medical image generating apparatus described below and the medical image generating method described above may be referred to correspondingly to each other.
Fig. 7 is a schematic structural diagram of a medical image generating apparatus according to the present invention, as shown in fig. 7, the medical image generating apparatus includes:
the image acquisition module 710 is configured to acquire a medical image corresponding to a current modality, and determine a target modality to be generated;
the image generation module 720 is configured to input the medical image to a first image generation model corresponding to the current modality and the target modality, and obtain a target medical image corresponding to the target modality output by the first image generation model;
the first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode; the first image generation model is obtained by training based on a sample medical image corresponding to the current mode and a sample target medical image corresponding to the target mode; the loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function;
The generator loss function is determined based on a first discrimination result of a first prediction target medical image obtained by inputting the sample medical image into the first image generation model and a second discrimination result of the sample target medical image;
the frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference; the low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image; the high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image.
According to the medical image generation apparatus provided by the embodiment of the present invention, the loss function of the first image generation model used for mode conversion is determined based on the generator loss function and the frequency decomposition loss function, and the frequency decomposition loss function is determined based on the low-frequency component difference and the high-frequency component difference. The low-frequency component difference is determined based on the difference between the low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image, so that the first image generation model learns the low-frequency component of the sample medical image while constraining the low-frequency information of the generated image, thereby improving the generation accuracy of the first image generation model and, in turn, of the multi-mode medical image. The high-frequency component difference is determined based on the difference between the high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image, so that the first image generation model learns the high-frequency component of the sample medical image while constraining the high-frequency information of the generated image, with the same benefit to generation accuracy. In this way, the high-frequency and low-frequency information of the image, and hence its identity characteristics and style characteristics, are attended to simultaneously, so that the generated image preserves both the identity information and the style information of the original image, improving the generation accuracy of the multi-mode medical image. In addition, the medical image corresponding to the current mode is input into the first image generation model to obtain the target medical image corresponding to the target mode output by the first image generation model, which realizes single input and single output without increasing the size of the first image generation model, further improves the stability and robustness of the first image generation model, and finally improves the generation accuracy of the multi-mode medical image.
Fig. 8 illustrates a physical structure diagram of an electronic device. As shown in fig. 8, the electronic device may include: a processor 810, a communication interface (Communications Interface) 820, a memory 830, and a communication bus 840, wherein the processor 810, the communication interface 820, and the memory 830 communicate with each other via the communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform a medical image generation method comprising: acquiring a medical image corresponding to a current mode, and determining a target mode to be generated; and inputting the medical image into a first image generation model corresponding to the current mode and the target mode to obtain a target medical image corresponding to the target mode output by the first image generation model. The first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode, and is trained based on a sample medical image corresponding to the current mode and a sample target medical image corresponding to the target mode. The loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function. The generator loss function is determined based on a first discrimination result of a first prediction target medical image, obtained by inputting the sample medical image into the first image generation model, and a second discrimination result of the sample target medical image. The frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference; the low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image; the high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image.
Further, the logic instructions in the memory 830 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the part of the technical solution of the present invention that in essence contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the medical image generation method provided by the above methods, the method comprising: acquiring a medical image corresponding to a current mode, and determining a target mode to be generated; and inputting the medical image into a first image generation model corresponding to the current mode and the target mode to obtain a target medical image corresponding to the target mode output by the first image generation model. The first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode, and is trained based on a sample medical image corresponding to the current mode and a sample target medical image corresponding to the target mode. The loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function. The generator loss function is determined based on a first discrimination result of a first prediction target medical image, obtained by inputting the sample medical image into the first image generation model, and a second discrimination result of the sample target medical image. The frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference; the low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image; the high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A medical image generation method, comprising:
acquiring a medical image corresponding to a current mode, and determining a target mode to be generated;
inputting the medical image to a first image generation model corresponding to the current mode and the target mode to obtain a target medical image corresponding to the target mode output by the first image generation model;
the first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode; the first image generation model is obtained by training based on a sample medical image corresponding to the current mode and a sample target medical image corresponding to the target mode; the loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function;
The generator loss function is determined based on a first discrimination result of a first prediction target medical image obtained by inputting the sample medical image into the first image generation model and a second discrimination result of the sample target medical image;
the frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference; the low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image; the high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image;
the low frequency component of the sample medical image is determined based on the steps of:
performing nonlinear processing on the sample medical image to obtain a processed sample medical image;
Converting the processed sample medical image into a spectral image;
performing low-pass filtering on the spectrum image to obtain a low-frequency component of the spectrum image;
the low frequency components of the spectral image are converted into low frequency components of the sample medical image.
2. The medical image generation method according to claim 1, wherein the performing nonlinear processing on the sample medical image to obtain a processed sample medical image includes:
inputting the sample medical image into a convolutional neural network model to obtain the processed sample medical image output by the convolutional neural network model;
wherein the convolutional neural network model includes an activation layer constructed based on an activation function.
3. The medical image generation method according to claim 1, wherein the converting the low frequency component of the spectral image into the low frequency component of the sample medical image comprises:
converting the low frequency component of the spectrum image into a low frequency component of a spatial domain image;
a low frequency component of the sample medical image is generated based on an absolute value of the low frequency component of the spatial domain image.
4. The medical image generation method according to claim 1, wherein the first image generation model is trained based on the steps of:
Inputting the sample medical image into the first image generation model to obtain a first prediction target medical image corresponding to the target mode output by the first image generation model;
inputting the first prediction target medical image to an image discrimination model corresponding to the target mode to obtain the first discrimination result output by the image discrimination model, and inputting the sample target medical image to the image discrimination model to obtain the second discrimination result output by the image discrimination model;
determining a third discrimination result based on a difference value between a preset normalization value and the second discrimination result;
determining the generator loss function based on the third discrimination result and the first discrimination result;
the first image generation model is trained based on the generator loss function and the frequency decomposition loss function.
5. The medical image generation method of claim 1, wherein the loss function of the first image generation model is determined based on a generator loss function, a frequency decomposition loss function, and a consistency loss function;
the consistency loss function is determined based on a first difference between a first predictive medical image and the sample medical image, the first predictive medical image being obtained by inputting the first predictive target medical image into a second image generation model;
The second image generation model is used for converting the image corresponding to the target mode into the image corresponding to the current mode.
6. The medical image generation method of claim 5, wherein the consistency loss function is determined based on the first and second differences;
the second difference is determined based on a difference of a second predicted target medical image and the sample target medical image;
the second prediction target medical image is obtained by inputting a second prediction medical image into the first image generation model, and the second prediction medical image is obtained by inputting the sample target medical image into the second image generation model.
7. A medical image generation apparatus, comprising:
the image acquisition module is used for acquiring a medical image corresponding to the current mode and determining a target mode to be generated;
the image generation module is used for inputting the medical image into a first image generation model corresponding to the current mode and the target mode to obtain a target medical image corresponding to the target mode output by the first image generation model;
the first image generation model is used for converting an image corresponding to the current mode into an image corresponding to the target mode; the first image generation model is obtained by training based on a sample medical image corresponding to the current mode and a sample target medical image corresponding to the target mode; the loss function of the first image generation model is determined based on a generator loss function and a frequency decomposition loss function;
The generator loss function is determined based on a first discrimination result of a first prediction target medical image obtained by inputting the sample medical image into the first image generation model and a second discrimination result of the sample target medical image;
the frequency decomposition loss function is determined based on a low-frequency component difference and a high-frequency component difference; the low-frequency component difference is determined based on the difference between a low-frequency component prediction image, obtained by inputting the low-frequency component of the sample medical image into the first image generation model, and the low-frequency component of the first prediction target medical image; the high-frequency component difference is determined based on the difference between a high-frequency component prediction image, obtained by inputting the high-frequency component of the sample medical image into the first image generation model, and the high-frequency component of the first prediction target medical image;
the low frequency component of the sample medical image is determined based on the steps of:
performing nonlinear processing on the sample medical image to obtain a processed sample medical image;
Converting the processed sample medical image into a spectral image;
performing low-pass filtering on the spectrum image to obtain a low-frequency component of the spectrum image;
the low frequency components of the spectral image are converted into low frequency components of the sample medical image.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the medical image generation method according to any one of claims 1 to 6 when executing the program.
9. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the medical image generation method according to any one of claims 1 to 6.
CN202311056376.XA 2023-08-22 2023-08-22 Medical image generation method, device, electronic equipment and storage medium Active CN116778021B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311056376.XA CN116778021B (en) 2023-08-22 2023-08-22 Medical image generation method, device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN116778021A CN116778021A (en) 2023-09-19
CN116778021B true CN116778021B (en) 2023-11-07

Family

ID=88011933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311056376.XA Active CN116778021B (en) 2023-08-22 2023-08-22 Medical image generation method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116778021B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115064B (en) * 2023-10-17 2024-02-02 南昌大学 Image synthesis method based on multi-mode control

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283151A (en) * 2021-08-16 2022-04-05 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium for medical image
WO2022163513A1 (en) * 2021-01-27 2022-08-04 富士フイルム株式会社 Learned model generation method, machine learning system, program, and medical image processing device
CN115311183A (en) * 2022-08-08 2022-11-08 广东工业大学 Medical image cross-mode synthesis method and system and readable storage medium
CN115984157A (en) * 2023-02-17 2023-04-18 安徽理工大学 Multi-modal medical image fusion method based on frequency division domain fusion
CN116309217A (en) * 2023-02-24 2023-06-23 武汉大学 Method, device, equipment and readable storage medium for MRI synthesized CT image

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9445129B2 (en) * 2010-04-13 2016-09-13 Sun Patent Trust Image coding method and image decoding method




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant