WO2023020198A1 - Image processing method, apparatus, device and storage medium for medical images - Google Patents

Image processing method, apparatus, device and storage medium for medical images

Info

Publication number
WO2023020198A1
Authority
WO
WIPO (PCT)
Prior art keywords: image, predicted, sample, network, image processing
Application number: PCT/CN2022/107341
Other languages: English (en), French (fr)
Inventor: Lin Yi (林一)
Original Assignee: Tencent Technology (Shenzhen) Company Limited
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2023020198A1
Priority to US18/132,824 (published as US20230245426A1)

Classifications

    • G: PHYSICS; G06: COMPUTING; CALCULATING OR COUNTING (parent classes shared by all codes below)
    • G06V 10/7715: Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06N 3/04: Neural networks; Architecture, e.g. interconnection topology
    • G06N 3/08: Neural networks; Learning methods
    • G06T 7/11: Image analysis; Segmentation; Region-based segmentation
    • G06V 10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/454: Local feature extraction; Biologically inspired filters; Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning; Classification, e.g. of video objects
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/776: Validation; Performance evaluation
    • G06V 10/806: Fusion of extracted features (combining data at the sensor, preprocessing, feature extraction, or classification level)
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06V 2201/03: Recognition of patterns in medical or anatomical images

Definitions

  • The present application relates to the field of medical technology, and in particular to an image processing method, apparatus, device, and storage medium for medical images.
  • medical images are usually input into a neural network model, and medical image segmentation is performed based on medical image features extracted by the neural network, so as to obtain medical image segmentation results.
  • However, the neural network model in the above-mentioned technologies often focuses on the strongly expressive features of the input medical images and pays less attention to the weakly expressive features in the medical images, so that the information contained in the obtained medical image segmentation results is not comprehensive, making the medical image segmentation effect poor.
  • Embodiments of the present application provide an image processing method, device, device, and storage medium for medical images.
  • the technical solution is as follows:
  • In one aspect, an image processing method for medical images is provided; the method is executed by a computer device, and the method includes:
  • invoking the first encoding network in the image processing model to encode the first sample image and obtain a first feature map of the first sample image; the first sample image is a sample medical image of a first modality of the target medical object;
  • invoking the decoding network in the image processing model to perform decoding based on the first feature map and obtain a predicted segmented image of the first sample image, the predicted segmented image being used to indicate at least one predicted region of a specified type;
  • invoking the generation network in the image processing model to generate a predicted generated image based on the first feature map; the predicted generated image is a predicted image of a second modality of the first sample image;
  • training the image processing model based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and a second sample image; the second sample image is a sample medical image of the second modality of the target medical object; the label image corresponds to the target medical object and is used to indicate at least one region of a specified type.
  • In another aspect, an image processing device for medical images is provided, the device comprising:
  • the first encoding module is used to call the first encoding network in the image processing model, encode the first sample image, and obtain the first feature map of the first sample image;
  • the first sample image is a sample medical image of the first modality of the target medical object;
  • a decoding module configured to call the decoding network in the image processing model, perform decoding based on the first feature map, and obtain a predicted segmented image of the first sample image; the predicted segmented image is used to indicate at least one predicted region of a specified type;
  • a generating module configured to call a generating network in the image processing model, and generate a predicted generated image based on the first feature map; the predicted generated image is a predicted image of a second modality of the first sample image;
  • a model training module configured to train the image processing model based on the difference between the predicted segmented image and the label image, the difference between the predicted generated image and the second sample image;
  • the second The sample image is a sample medical image of the second modality of the target medical object;
  • the label image is an image corresponding to the target medical object and used to indicate at least one specified type of region.
  • the model training module includes:
  • a first determination submodule configured to determine a function value of a first loss function based on the difference between the predicted segmentation image and the label image;
  • a second determining submodule configured to determine a function value of a second loss function based on the difference between the predicted generated image and the second sample image
  • the model training submodule is configured to train the image processing model based on the function value of the first loss function and the function value of the second loss function.
  • the model training submodule is configured to update parameters of the first encoding network and parameters of the decoding network based on a function value of the first loss function;
  • and to update the parameters of the first encoding network and the parameters of the generation network based on the function value of the second loss function.
  • the first determining submodule includes:
  • a first determination unit configured to determine a function value of a first branch function of the first loss function based on the similarity between the predicted segmentation image and the label image;
  • the second determination unit is configured to determine the function value of a second branch function of the first loss function based on the position of the at least one predicted region of a specified type in the predicted segmented image and the position of the at least one region of a specified type in the label image;
  • a third determining unit configured to determine the function value of the first loss function based on the function value of the first branch function and the function value of the second branch function.
  • the first determination unit is configured to obtain weight values corresponding to each divided area in the predicted segmented image, each divided area in the predicted segmented image including at least one region of a specified type, and to determine the function value of the first branch function based on the weight values and on the similarity between each divided area in the predicted segmented image and the corresponding divided area in the label image;
  • the device further includes:
  • a discrimination module configured to invoke a discriminator to discriminate the predicted generated image and obtain a discrimination result of the predicted generated image; the discrimination result is used to indicate whether the predicted generated image is a real image;
  • a third determining module configured to determine a function value of a third loss function based on the discrimination result;
  • the model training module is configured to train the image processing model based on the function value of the first loss function, the function value of the second loss function and the function value of the third loss function.
  • the first encoding network includes N encoding layers, and the N encoding layers are connected in pairs, where N ≥ 2 and N is a positive integer;
  • the first coding module includes:
  • a set acquisition submodule configured to acquire a first image pyramid of the first sample image, where the first image pyramid is an image set obtained by downsampling the first sample image according to a specified gradient, and the first The image pyramid includes N first images to be processed;
  • an encoding submodule configured to input the N first images to be processed into the corresponding encoding layers, encode the N first images to be processed, and obtain the N first feature maps of the first sample image;
  • in response to a target coding layer being a non-first coding layer among the N coding layers, the input of the target coding layer also includes the first feature map output by the previous coding layer.
  • the decoding network in the image processing model includes N decoding layers, the N decoding layers are connected in pairs, and the N decoding layers correspond one to one with the N encoding layers;
  • the decoding module includes:
  • the decoding sub-module is configured to input the N first feature maps into corresponding decoding layers, decode the N first feature maps, and obtain N decoding results; the N decoding results have the same resolution;
  • a merging submodule configured to combine the N decoding results to obtain a predicted segmented image of the first sample image
  • in response to a target decoding layer being a non-first decoding layer among the N decoding layers, the input of the target decoding layer also includes the decoding result output by the previous decoding layer.
  • the device further includes:
  • An image acquisition module configured to acquire a prior constrained image of the image processing model based on a third sample image;
  • the third sample image is a sample medical image of a third modality of the target medical object; the prior constrained image is used to indicate the position of the target medical object in the third sample image;
  • the second encoding module is configured to call a second encoding network in the image processing model, perform encoding based on the prior constrained image, and obtain a second feature map of the third sample image;
  • a merging module configured to merge the first feature map and the second feature map to obtain a comprehensive feature map
  • the decoding module is configured to call the decoding network in the image processing model, perform decoding based on the comprehensive feature map, and obtain the predicted segmented image of the first sample image;
  • the generation module is configured to call a generation network in the image processing model to generate the predicted generation image based on the integrated feature map.
  • the device further includes:
  • a cropping module configured to crop the prior constrained image based on the position of the target medical object
  • the second encoding module is configured to invoke a second encoding network in the image processing model to encode the cropped prior constrained image to obtain a second feature map of the third sample image.
  • the image acquisition module is configured to invoke a semantic segmentation network to process the third sample image, and acquire a priori constrained image of the image processing model.
  • the parameters in the second coding network share the parameter weights in the first coding network.
  • In another aspect, a computer device is provided, including a processor and a memory, where at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the above image processing method for medical images.
  • In another aspect, a computer-readable storage medium is provided, wherein at least one computer program is stored in the computer-readable storage medium, and the computer program is loaded and executed by a processor to implement the above image processing method for medical images.
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the image processing method for medical images provided in the various optional implementation manners above.
  • In the image processing method for medical images described above, a multi-modal sample medical image of a target medical object is obtained, together with a label image of the target medical object containing labels of regions of a specified type; a predicted segmented image and a predicted generated image are generated based on the first sample image in the multi-modal sample medical image; and the image processing model, which includes the first encoding network, the decoding network, and the generation network, is trained based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image of the target medical object. The trained image processing model can thus obtain the characteristics of multi-modal medical images from a single-modality medical image, so that the obtained medical image segmentation results contain more comprehensive information, which improves the segmentation effect of medical images;
  • in addition, the trained image processing model can generate medical images of other modalities from a single-modality medical image, thereby solving the problem of missing images in the process of medical image analysis.
  • FIG. 1 shows a schematic diagram of a system architecture of an image processing method for medical images provided by an exemplary embodiment of the present application
  • FIG. 2 shows a flowchart of an image processing method for medical images provided by an exemplary embodiment of the present application
  • Fig. 3 is a frame diagram showing generation of an image processing model and image processing according to an exemplary embodiment
  • FIG. 4 shows a flowchart of an image processing method for medical images provided by an exemplary embodiment of the present application
  • Fig. 5 shows a flowchart of an image processing method for medical images shown in an exemplary embodiment of the present application
  • Fig. 6 shows a schematic diagram of synthesis of approximate marks shown in an exemplary embodiment of the present application
  • Fig. 7 shows a schematic structural diagram of an image processing model shown in an exemplary embodiment of the present application.
  • FIG. 8 shows a schematic structural diagram of a coding layer shown in an exemplary embodiment of the present application.
  • FIG. 9 shows a schematic structural diagram of a decoding layer shown in an exemplary embodiment of the present application.
  • Fig. 10 shows a schematic diagram of an application process of an image processing model shown in an exemplary embodiment of the present application
  • Fig. 11 shows a block diagram of an image processing device for medical images shown in an exemplary embodiment of the present application
  • Fig. 12 shows a structural block diagram of a computer device shown in an exemplary embodiment of the present application
  • Fig. 13 shows a structural block diagram of a computer device shown in an exemplary embodiment of the present application.
  • FIG. 1 shows a schematic diagram of a system architecture of an image processing method for medical images provided by an exemplary embodiment of the present application.
  • the system includes: a computer device 110 and a medical image acquisition device 120 .
  • the above-mentioned computer device 110 can be implemented as a terminal or a server.
  • Specifically, the computer device 110 can be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network), big data, and artificial intelligence platforms.
  • the computer device 110 may be a smart phone, a tablet computer, a laptop computer, a desktop computer, and the like.
  • the above-mentioned medical image acquisition device 120 is a device with a medical image acquisition function.
  • Schematically, the medical image acquisition device can be a CT (Computed Tomography) detector for medical detection, a nuclear magnetic resonance instrument, a positron emission tomography scanner, a cardiac magnetic resonance instrument, or other equipment with an image acquisition device.
  • Magnetic resonance is a non-invasive imaging technique; CMR (Cardiac Magnetic Resonance) images obtained based on cardiac magnetic resonance imaging can provide anatomical and functional information of the heart to assist in the clinical diagnosis and treatment of heart diseases. For example, CMR images can assist in the clinical diagnosis and treatment of myocardial infarction.
  • Cardiac magnetic resonance imaging is a multi-modal imaging method: different CMR imaging sequences have different imaging focuses and provide different cardiac feature information. Schematically, CMR imaging sequences may include balanced steady-state free precession (bSSFP), which can capture heart motion so that the corresponding bSSFP image presents a complete and clear myocardial boundary, and T2-weighted imaging, whose corresponding T2-weighted image can clearly and accurately display myocardial edema or myocardial ischemic injury.
  • T2-weighted images can display myocardial edema or myocardial ischemic injury in a highlighted form; with late gadolinium enhancement (LGE) technology, the corresponding LGE image can highlight the myocardial scar or myocardial infarction area.
  • the multimodal medical images shown in this application may be based on medical images corresponding to the same medical object acquired by different medical image acquisition devices.
  • For example, the multimodal medical images may include medical images such as T1-weighted images, T2-weighted images, and CT images.
  • the above system includes one or more computer devices 110 and one or more medical image acquisition devices 120 .
  • the embodiment of the present application does not limit the number of computer devices 110 and medical image acquisition devices 120 .
  • the medical image acquisition device 120 and the computer device 110 are connected through a communication network.
  • the communication network is a wired network or a wireless network.
  • the aforementioned wireless network or wired network uses standard communication technologies and/or protocols.
  • The network is usually the Internet, but can be any network, including but not limited to any combination of a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired, or wireless network, a private network, or a virtual private network.
  • data exchanged over a network is represented using technologies and/or formats including Hyper Text Mark-up Language (HTML), Extensible Markup Language (XML), and the like.
  • Fig. 2 shows a flow chart of an image processing method for medical images provided by an exemplary embodiment of the present application. The method is executed by a computer device, which can be implemented as the server shown in Fig. 1. As shown in Fig. 2, the image processing method for medical images comprises the following steps:
  • Step 210: call the first encoding network in the image processing model to encode the first sample image and obtain the first feature map of the first sample image; the first sample image is a sample medical image of the first modality of the target medical object.
  • In the embodiment of the present application, the region labels of the specified type can be used to represent the lesion information in the first sample image, and the lesion information can include information such as the position and shape of the lesion in the first sample image.
  • The sample images (including the first sample image and the second sample image) in the embodiment of the present application may be medical images acquired by a medical image acquisition device, such as the medical image acquisition device shown in Fig. 1; alternatively, the sample images can be obtained based on the medical image data stored in a database.
  • Schematically, the sample images involved in the embodiment of the present application can be obtained based on the image data in the public dataset MyoPS20, which consists of multi-sequence myocardial CMR cases, including bSSFP images, T2-weighted images, and LGE images of 45 cases, 25 of which were labeled.
  • bSSFP images were included in each patient's raw CMR sequence.
  • T2-weighted images consisted of 3–7 slices with an in-plane resolution of 1.35 × 1.35 mm and a slice thickness of 12–20 mm.
  • LGE images have 10–18 slices with an in-plane resolution of 0.75 × 0.75 mm and a slice thickness of 5 mm. The above images are aligned to a common space and resampled to the same spatial resolution to obtain the sample images used in this application.
  • The label image of the first sample image may also contain labels of other lesion regions that are displayed less clearly in the first sample image.
  • Take the case where the first sample image is a T2-weighted image of a target medical object as an example, where the more clearly displayed lesion area is the myocardial edema area: if the target medical object of the T2-weighted image also corresponds to a myocardial scar, then the label image of the T2-weighted image can include the myocardial scar area label in addition to the myocardial edema area label.
  • Correspondingly, the label image of the LGE image includes not only the label of the myocardial scar area but may also include the label of the myocardial edema region; that is, the label images corresponding to the medical images of different modalities of the same target medical object are the same.
  • the modality of the sample medical image is used to indicate the acquisition method of the medical image.
  • The sample medical image of the first modality may be a T2-weighted image, or an LGE image, or a medical image collected by any other medical image collection method.
  • Step 220: call the decoding network in the image processing model to perform decoding based on the first feature map and obtain the predicted segmented image of the first sample image; the predicted segmented image is used to indicate at least one predicted region of a specified type.
  • In the embodiment of the present application, the number of predicted regions of the specified type in the predicted segmented image is equal to the number of labeled regions of the specified type in the label image.
  • the predicted specified type area may be the lesion area in the first sample image predicted based on the processing by the first encoding network and the decoding network.
  • Step 230: call the generation network in the image processing model to generate a predicted generated image based on the first feature map; the predicted generated image is a predicted image of the second modality of the first sample image.
  • the first modality to which the first sample image belongs is different from the second modality to which the predicted generated image belongs.
  • Step 240: train the image processing model based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image;
  • the second sample image is a sample medical image of the second modality of the target medical object;
  • the label image corresponds to the target medical object and is used to indicate at least one region of a specified type.
  • During model training, the computer device obtains different sample medical images of the first modality as the first sample image and iteratively executes the above steps 210 to 240, iteratively updating the parameters in the image processing model based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image, until the training completion condition is reached.
  • Schematically, the training completion condition includes: the image processing model converges, the number of iterations reaches a threshold, and so on.
  • The image processing model after training can be used to perform medical image segmentation on an input target medical image of the first modality to obtain regions of a specified type in the target medical image, and/or to generate a medical image of the second modality for the target medical image.
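  • As an illustration, the following is a minimal PyTorch sketch of one training iteration of steps 210 to 240, assuming hypothetical encoder, decoder, and generator modules and externally supplied loss functions; it is a sketch of the shared-encoder update described above, not the patent's reference implementation.

```python
# Minimal sketch of one training iteration (steps 210-240).
# `encoder`, `decoder`, `generator`, `seg_loss_fn`, and `gen_loss_fn`
# are hypothetical stand-ins for the networks and losses described above.
import torch

def train_step(encoder, decoder, generator, optimizer,
               first_sample, second_sample, label_image,
               seg_loss_fn, gen_loss_fn):
    optimizer.zero_grad()
    feature_map = encoder(first_sample)        # first encoding network (step 210)
    pred_seg = decoder(feature_map)            # predicted segmented image (step 220)
    pred_gen = generator(feature_map)          # predicted generated image (step 230)
    loss_seg = seg_loss_fn(pred_seg, label_image)    # first loss: vs. label image
    loss_gen = gen_loss_fn(pred_gen, second_sample)  # second loss: vs. 2nd-modality sample
    (loss_seg + loss_gen).backward()           # both losses update the shared encoder
    optimizer.step()
    return loss_seg.item(), loss_gen.item()
```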
  • In summary, the image processing method for medical images acquires multi-modal sample medical images of a target medical object, together with a label image corresponding to the target medical object containing labels of regions of a specified type; a predicted segmented image and a predicted generated image are generated based on the first sample image in the multi-modal sample medical images; and the image processing model including the first encoding network, the decoding network, and the generation network is trained based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image of the target medical object, so that the trained image processing model can obtain the characteristics of multi-modal medical images from a single-modality medical image. The information contained in the obtained medical image segmentation results is therefore more comprehensive, which improves the segmentation effect of medical images;
  • moreover, the trained image processing model can generate medical images of other modalities from a single-modality medical image, thereby solving the problem of missing images in the process of medical image analysis.
  • In the embodiment of the present application, the image processing model is obtained by training on the multimodal medical sample images of the same target medical object and the label image corresponding to the target medical object, which can improve the medical image segmentation effect of the image processing model and solve the problem of missing images in the process of medical image analysis.
  • the application scenarios of the above solutions include but are not limited to the following scenarios:
  • Different imaging sequences can image different characteristics of the heart and provide corresponding information, including late gadolinium-enhanced (LGE) images showing areas of myocardial infarction, T2-weighted images that highlight myocardial edema or myocardial ischemic damage, and balanced steady-state free precession (bSSFP) sequence images that can capture cardiac motion and present clear boundaries. These multi-sequence CMR images can provide rich and reliable information about myocardial pathology and morphology, and help doctors in diagnosis and treatment planning.
  • Based on the image processing method for medical images provided by the embodiment of the present application, respective image processing models can be obtained for T2-weighted images and LGE images, that is, an image processing model trained with sample medical images of the T2-weighted modality as the first sample image, and an image processing model trained with sample medical images of the LGE modality as the first sample image. Inputting a T2-weighted image into the image processing model of the T2-weighted modality yields a segmented image containing the myocardial scar area and the myocardial edema area.
  • Image processing models corresponding to the above scenarios can be obtained through the image processing method for medical images provided by this application to determine the position and shape of a lesion in an organ, for example, to determine the location, shape, and size of a gastric ulcer lesion in the stomach, so that medical staff can allocate medical resources based on the location, shape, and size of the lesion. Therefore, the image processing model obtained based on the image processing method for medical images provided by this application can improve the accuracy of medical image segmentation, and can further improve the accuracy of lesion judgment, so as to realize a reasonable allocation of medical resources.
  • Fig. 3 is a frame diagram showing image processing model generation and image processing according to an exemplary embodiment.
  • As shown in Fig. 3, the image processing model generation device 310 uses a preset training sample data set (including the sample medical images of the first modality and the label images of the target medical objects corresponding to the sample medical images) to train and obtain an image processing model, which is then used for image processing.
  • The image processing device 320 processes the input target medical image of the first modality based on the image processing model to obtain an image segmentation result of the target medical image. The image segmentation result can include information about the medical object corresponding to the target medical image that could otherwise only be obtained from medical images of multiple modalities, such as the position and shape of at least one lesion in the medical object; and/or the image processing device processes the input target medical image of the first modality to obtain an image generation result, generating a second-modality medical image of the medical object corresponding to the target medical image, so as to solve the problem of missing images and make the image segmentation results interpretable.
  • In one possible implementation, when applying the image processing model, if image segmentation of the target medical image is required, the first encoding network and the decoding network in the image processing model can be used; alternatively, an image segmentation model can be reconstructed based on the first encoding network and the decoding network of the image processing model, where the parameters in the image segmentation model are consistent with the parameters of the first encoding network and the decoding network in the image processing model. In another possible implementation, when applying the image processing model, if it is necessary to generate the corresponding medical image of the second modality based on the input target medical image of the first modality, the first encoding network and the generation network in the image processing model can be used; alternatively, an image generation model can be reconstructed based on the first encoding network and the generation network in the image processing model, where the parameters in the image generation model are consistent with the parameters of the first encoding network and the generation network in the image processing model.
  • The above-mentioned image processing model generation device 310 and image processing device 320 may be computer devices; for example, the computer devices may be fixed computer devices such as personal computers and servers, or the computer devices may also be mobile devices such as tablet computers and e-book readers.
  • the image processing model generation device 310 and the image processing device 320 may be the same device, or the image processing model generation device 310 and the image processing device 320 may also be different devices.
  • the image processing model generation device 310 and the image processing device 320 may be the same type of device, for example, the image processing model generation device 310 and the image processing device 320 may Both are servers; or the image processing model generating device 310 and the image processing device 320 may also be different types of devices, for example, the image processing device 320 may be a personal computer or a terminal, while the image processing model generating device 310 may be a server or the like.
  • the embodiment of the present application does not limit specific types of the image processing model generating device 310 and the image processing device 320 .
  • Fig. 4 shows a flowchart of an image processing method for medical images provided by an exemplary embodiment of the present application. The method is executed by a computer device, which can be implemented as the server shown in Fig. 1. As shown in Fig. 4, the image processing method for medical images includes the following steps:
  • Step 410: call the first encoding network in the image processing model to encode the first sample image and obtain the first feature map of the first sample image; the first sample image is a sample medical image of the first modality of the target medical object.
  • Step 420: call the decoding network in the image processing model to perform decoding based on the first feature map and obtain a predicted segmented image of the first sample image; the predicted segmented image is used to indicate at least one predicted region of a specified type.
  • Step 430: call the generation network in the image processing model to generate a predicted generated image based on the first feature map; the predicted generated image is a predicted image of the second modality of the first sample image.
  • Step 440: determine the function value of the first loss function based on the difference between the predicted segmented image and the label image.
  • In one possible implementation, the function value of the first branch function of the first loss function is determined based on the similarity between the predicted segmented image and the label image, and the function value of the second branch function is determined based on the position of the at least one predicted region of a specified type in the predicted segmented image and the position of the at least one region of a specified type in the label image;
  • based on the function value of the first branch function and the function value of the second branch function, the function value of the first loss function is determined.
  • Schematically, the above process of determining the function value of the first branch function of the first loss function based on the similarity between the predicted segmented image and the label image may include: obtaining the weight values corresponding to each divided area in the predicted segmented image, where each divided area contains at least one region of a specified type; and determining the function value of the first branch function of the first loss function based on the weight values corresponding to each divided area in the predicted segmented image and the similarity between each divided area in the predicted segmented image and the corresponding divided area in the label image.
  • each of the above-mentioned divided areas includes a specified type of area; further, each of the above-mentioned divided areas may also include at least one other type of area other than the specified type of area, such as a background area, a normal area, and so on.
  • In the embodiment of the present application, a Focal Dice loss L_FDL can be used as the first branch function of the first loss function. In the first branch function, different divided regions are given different weights, so that regions that are difficult to segment obtain higher weights during segmentation and the network can focus on learning the more difficult categories. In a form consistent with the surrounding description (the original formula is rendered as an image and not reproduced here), the first branch function can be written as L_FDL = Σ_t w_t · (1 − Dice_t), where w_t represents the weight of divided region t and Dice_t is the Dice coefficient of region t between the predicted segmented image and the label image.
  • The Dice coefficient is a measurement function used to evaluate the similarity of two samples, with a value range between 0 and 1; the larger the value, the more similar the two samples. In the embodiment of the present application, the similarity between the two samples is reflected in the similarity between the predicted segmented image and the label image, and further in the similarity between each divided region in the predicted segmented image and the corresponding divided region in the label image.
  • The divided areas in the predicted segmented image can include the lesion area, the normal area, and the background area. Generally speaking, the background area accounts for the largest proportion of the area of the medical image, the normal area the next largest, and the lesion area the smallest. In order to make the decoding network pay more attention to the lesion area, the weight value of the lesion area in the first branch function can be set to the largest, the weight value of the normal area next, and the weight value of the background area the smallest. Optionally, the weight value is inversely proportional to the proportion of the corresponding divided area in the area of the medical image; or, the weight value corresponds to the type of each divided area.
  • In the embodiment of the present application, the mean square error (MSE) loss function can be used to quantify the difference between the position of the at least one predicted region of a specified type in the predicted segmented image and the position of the at least one region of a specified type in the label image; in a form consistent with this description (the original formula is not reproduced here), the second branch function of the first loss function can be written as L_MSE, the mean squared per-pixel difference between the predicted region map and the labeled region map.
  • the specified type of region in the label image may include at least one of a lesion region, a normal region and a background region.
  • In one possible implementation, the sum of the function value of the first branch function and the function value of the second branch function is obtained as the function value of the first loss function. Further, to balance the first branch function and the second branch function, different weight values can be set for the two branch functions, in which case the first loss function can be expressed as L1 = L_FDL + λ · L_MSE, where λ represents the weight value of the second branch function relative to the first branch function; for example, the value of λ may be 100.
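  • As an illustration, the following PyTorch sketch implements the first loss function in the hedged form given above (a per-region weighted Dice branch plus an MSE branch with weight λ); the exact formulas in the original are rendered as images and may differ in detail.

```python
# Hedged sketch of the first loss: weighted ("focal") Dice branch plus
# an MSE branch combined with weight lam (the lambda in the text, e.g. 100).
import torch

def focal_dice_loss(pred, target, region_weights, eps=1e-6):
    # pred, target: (B, T, H, W) soft/one-hot maps over T divided regions
    inter = (pred * target).sum(dim=(0, 2, 3))
    union = pred.sum(dim=(0, 2, 3)) + target.sum(dim=(0, 2, 3))
    dice = (2 * inter + eps) / (union + eps)       # Dice coefficient per region t
    return (region_weights * (1 - dice)).sum()     # harder regions carry larger w_t

def first_loss(pred_seg, label, region_weights, lam=100.0):
    branch1 = focal_dice_loss(pred_seg, label, region_weights)
    branch2 = torch.mean((pred_seg - label) ** 2)  # MSE branch over region positions
    return branch1 + lam * branch2
```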
  • Step 450: determine the function value of the second loss function based on the difference between the predicted generated image and the second sample image; the second sample image is the sample medical image of the second modality of the target medical object.
  • In a form consistent with the surrounding description (the original formula is not reproduced here), the second loss function can be written as the per-pixel difference between x and x', averaged over the H × W pixels, where x represents the predicted generated image, x' represents the sample medical image of the second modality of the target medical object, and H and W represent the height and width of the predicted generated image (and of the sample medical image of the second modality), respectively.
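  • A per-pixel reconstruction loss consistent with this description might look as follows; whether the original uses an L1 or L2 distance is not visible in this text, so an L1 form is shown as an assumption.

```python
# Hedged sketch of the second loss: mean per-pixel difference between the
# predicted generated image x and the second-modality sample x', averaged
# over the H x W pixels.
import torch

def second_loss(x, x_prime):
    h, w = x.shape[-2], x.shape[-1]
    return torch.abs(x - x_prime).sum(dim=(-2, -1)).mean() / (h * w)
```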
  • Step 460: train the image processing model based on the function value of the first loss function and the function value of the second loss function.
  • In the embodiment of the present application, the first loss function and the second loss function update the parameters of different network combinations in the image processing model:
  • based on the function value of the first loss function, the parameters of the first encoding network and the parameters of the decoding network are updated;
  • based on the function value of the second loss function, the parameters of the first encoding network and the parameters of the generation network are updated.
  • It should be noted that both the function value of the first loss function and the function value of the second loss function guide the parameter update of the first encoding network in the image processing model. Therefore, the generation network assists the training of the image segmentation model (the model consisting of the encoding network and the decoding network); in other words, the decoding network assists the training of the image generation model (the model consisting of the encoding network and the generation network).
  • In one possible implementation, a third loss function can also be introduced to train the image processing model; the third loss function is used to indicate the authenticity of the predicted generated image. This process can be implemented as: invoking a discriminator to discriminate the predicted generated image to obtain a discrimination result, and determining the function value of the third loss function based on the discrimination result; the discrimination result is used to indicate whether the predicted generated image is a real image.
  • In a form consistent with the surrounding description (the original formula is not reproduced here), the third loss function can take the standard adversarial form, where G represents the generation network, D represents the discriminant network (the discriminator), and the expectation is taken over the real image distribution: the discriminator is trained to judge real second-modality images as real and predicted generated images as fake, while the generation network is trained to make the discriminator judge its generated images as real.
  • In this case, training the image processing model includes: training the image processing model based on the function value of the first loss function, the function value of the second loss function, and the function value of the third loss function.
  • Schematically, based on the function value of the third loss function, the parameters of the first encoding network and the parameters of the generation network are updated.
  • In the embodiment of the present application, the discriminator can be pre-trained; alternatively, the parameters in the discriminator can be updated based on the function value of the third loss function, in which case the input of the discriminator also includes the sample medical image of the second modality, which is used to train the discriminator. The discriminator plays an auxiliary role in the training of the generation network to improve the quality of the images generated by the generation network.
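  • The following sketch shows a standard adversarial third loss of the kind described, with separate generator-side and discriminator-side terms; the binary cross-entropy form is an assumption, since the original formula is not reproduced here.

```python
# Hedged sketch of the third (adversarial) loss. `disc` is the
# discriminator D; `fake` is the predicted generated image from G;
# `real` is the second-modality sample medical image.
import torch
import torch.nn.functional as F

def third_loss_generator(disc, fake):
    logits = disc(fake)                            # G wants D to call `fake` real
    return F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))

def third_loss_discriminator(disc, real, fake):
    real_logits = disc(real)
    fake_logits = disc(fake.detach())              # do not backprop into G here
    return (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
```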
  • To sum up, the image processing method for medical images acquires multi-modal sample medical images of a target medical object, together with a label image corresponding to the target medical object containing labels of regions of a specified type; a predicted segmented image and a predicted generated image are generated based on the first sample image in the multi-modal sample medical images; and the image processing model including the first encoding network, the decoding network, and the generation network is trained based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image of the target medical object, so that the trained image processing model can obtain the characteristics of multi-modal medical images from a single-modality medical image. The information contained in the obtained medical image segmentation results is therefore more comprehensive, which improves the segmentation effect of medical images;
  • moreover, the trained image processing model can generate medical images of other modalities from a single-modality medical image, thereby solving the problem of missing images in the process of medical image analysis.
  • In one possible implementation, the prior constrained image of the image processing model can be obtained based on the third sample image; the prior constrained image is used to indicate the predicted position of the target medical object in the sample image.
  • The third sample image may be one of the first sample image and the second sample image, or the third sample image may be a sample medical image of a third modality of the target medical object. For cardiac magnetic resonance images, the third sample image may be a bSSFP image, which is capable of capturing cardiac motion and presenting a clear boundary.
  • The prediction of the position of the target medical object in the sample image is therefore more accurate when based on the prior constrained image acquired from the bSSFP image.
  • Fig. 5 shows a flowchart of an image processing method for medical images according to an exemplary embodiment of the present application. As shown in Fig. 5, the method includes the following steps:
  • Step 510: based on the third sample image, acquire the prior constrained image of the image processing model; the third sample image is a sample medical image of the third modality of the target medical object; the prior constrained image is used to indicate the position of the target medical object in the third sample image.
  • Since the multimodal sample medical images of the same target medical object are aligned to a common space, the position of the target medical object in the third sample image indicated by the prior constrained image can also indicate the position of the target medical object in the other sample images (including the first sample image and the second sample image).
  • In one possible implementation, the semantic segmentation network (U-Net) can be called to process the third sample image to obtain the prior constrained image of the image processing model.
  • In the embodiment of the present application, the semantic segmentation network is obtained by training on a sample image set; the sample image set contains fourth sample images and approximate labels of the fourth sample images, where a fourth sample image refers to a sample medical image of the third modality of another medical object, and the approximate label is used to indicate the position of the other medical object in the fourth sample image.
  • For cardiac magnetic resonance images, the myocardial edema area and the myocardial scar area account for a relatively low proportion of the medical image, and the corresponding areas do not overlap. Therefore, an image obtained by merging the normal myocardial area, the myocardial edema area, and the myocardial scar area can be used as the approximate label of the medical image.
  • Fig. 6 shows a schematic diagram of the synthesis of approximate labels according to an exemplary embodiment of the present application. As shown in Fig. 6, the myocardial edema area 610 and the myocardial scar area 620 are combined with the normal myocardial area 630 to generate an approximate label 640. The approximate label is used as the label of the fourth sample image to train the semantic segmentation network, so that the trained semantic segmentation network can process the input third sample image to obtain the prior constrained image of the third sample image.
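  • A minimal sketch of this approximate-label synthesis, assuming binary masks for the three areas, is shown below.

```python
# Merge the edema, scar and normal-myocardium masks (Fig. 6: 610, 620, 630)
# into a single approximate label (640). Inputs are binary H x W arrays.
import numpy as np

def approximate_label(edema_mask, scar_mask, normal_myo_mask):
    merged = (edema_mask > 0) | (scar_mask > 0) | (normal_myo_mask > 0)
    return merged.astype(np.uint8)
```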
  • Step 520: call the second encoding network in the image processing model to perform encoding based on the prior constrained image and obtain the second feature map of the third sample image.
  • the image processing model may also include a second encoding network.
  • The parameters of the second encoding network may be kept consistent with those of the first encoding network; that is, the parameters in the second encoding network share the weights of the parameters in the first encoding network.
  • In one possible implementation, after the prior constrained image is obtained, it can be cropped based on the position of the target medical object; after that, the second encoding network in the image processing model can be invoked to encode the cropped prior constrained image and obtain the second feature map of the third sample image.
  • To adapt to the size of the prior constrained image, the other sample images (including the first sample image and the second sample image) are preprocessed, that is, the other sample images are cropped, ensuring that the position of the target medical object in the other sample images is similar, within a specified error range, to the position of the target medical object in the prior constrained image.
  • Schematically, it can be ensured that the position of the target medical object in the other sample images, like the position of the target medical object in the prior constrained image, is at the center of the image within the specified error range.
  • Schematically, the prior constrained image is cropped according to the center of the approximate label obtained above; for the other sample images, cropping can be performed based on the position of the regions of the specified type in the label image, or also based on the center of the above approximate label. This application does not limit the basis for cropping the other sample images.
  • In one possible implementation, the cropped images can be further processed; for example, after uniformly setting the window level and window width, histogram equalization and a random gamma method can be applied to further balance the data distribution.
  • In addition, data enhancement processing may be performed on the sample images; the data enhancement methods include random rotation, random cropping, random scaling, and the like.
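  • The cropping and intensity-balancing steps described above might be sketched as follows; the crop size, gamma range, and library choices (NumPy, scikit-image) are assumptions rather than the original pipeline.

```python
# Hedged preprocessing sketch: crop around the centre of the approximate
# label, then apply histogram equalization and a random gamma.
import numpy as np
from skimage import exposure

def crop_around_label(image, approx_label, size=256):
    ys, xs = np.nonzero(approx_label)
    cy, cx = int(ys.mean()), int(xs.mean())   # centre of the approximate label
    half = size // 2                          # assumes the crop stays in bounds
    return image[cy - half:cy + half, cx - half:cx + half]

def balance_intensity(patch, rng=np.random.default_rng()):
    patch = exposure.equalize_hist(patch)     # histogram equalization
    gamma = rng.uniform(0.8, 1.2)             # random gamma; range is assumed
    return exposure.adjust_gamma(patch, gamma)
```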
  • Step 530: call the first encoding network in the image processing model to encode the first sample image and obtain the first feature map of the first sample image; the first sample image is a sample medical image of the first modality of the target medical object.
  • It should be noted that if the image input into the second encoding network is the original prior constrained image, the first sample image here is the original first sample image; if the image input into the second encoding network is a cropped prior constrained image, the first sample image here is the cropped first sample image. That is, the sizes of the images input into the respective encoding networks remain consistent.
  • In the embodiment of the present application, in order to make the generated predicted segmented image and/or predicted generated image more accurate, the image processing model can be built based on a butterfly network architecture: the first encoding network in the butterfly network includes N encoding layers, and the N encoding layers are connected in pairs; the decoding network in the butterfly network includes N decoding layers, and the N decoding layers are connected in pairs; and the N decoding layers in the decoding network correspond one to one with the N encoding layers in the first encoding network.
  • Fig. 7 shows a schematic structural diagram of an image processing model according to an exemplary embodiment of the present application. As shown in Fig. 7:
  • the first encoding network 710 includes N encoding layers
  • the decoding network 730 includes N decoding layers
  • the first encoding layer 711 in the first encoding network 710 corresponds to the Nth decoding layer 733 in the decoding network 730;
  • the second encoding layer 712 in the first encoding network 710 corresponds to the (N-1)th decoding layer 732;
  • and the Nth encoding layer 713 in the first encoding network 710 corresponds to the first decoding layer 731 in the decoding network 730.
  • In addition, the image processing model includes a second encoding network 720; the second encoding network 720 may also include N encoding layers, and the N encoding layers in the second encoding network are connected in pairs.
  • the foregoing N coding layers are connected in pairs, which may mean that two adjacent coding layers are connected.
  • In one possible implementation, the process of calling the first encoding network in the image processing model to encode the first sample image and obtain the first feature maps corresponding to the first sample image can be implemented as:
  • acquiring a first image pyramid of the first sample image, where the first image pyramid is an image collection obtained by downsampling the first sample image according to a specified gradient and includes N first images to be processed, and inputting the N first images to be processed into the corresponding encoding layers to obtain the N first feature maps of the first sample image;
  • in response to a target coding layer being a non-first coding layer among the N coding layers, the input of the target coding layer also includes the first feature map output by the previous coding layer.
The resolutions of the first images to be processed in the first image pyramid differ. Each image in the first image pyramid corresponds to a side input path, and each side input path is used to feed the corresponding first image to be processed into the corresponding encoding layer of the first encoding network. As shown in FIG. 7, the first image pyramid 750 contains N first images to be processed, each corresponding to one side input path; for a non-first encoding layer in the first encoding network 710, the input includes the first image to be processed from the corresponding side input path and the first feature map output by the previous encoding layer. A sketch of building such a pyramid appears below.
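As a hedged illustration, the following sketch builds an N-level image pyramid by repeatedly downsampling by a factor of 2; the factor and the averaging-based downsampler are assumptions, since the application only specifies downsampling "according to a specified gradient".

```python
import numpy as np

def build_image_pyramid(img: np.ndarray, n_levels: int) -> list[np.ndarray]:
    """Return [full res, 1/2 res, 1/4 res, ...] with n_levels entries.
    2x average-pool downsampling is an assumption for illustration."""
    pyramid = [img]
    for _ in range(n_levels - 1):
        h, w = pyramid[-1].shape[:2]
        x = pyramid[-1][: h - h % 2, : w - w % 2]  # make dimensions even
        x = (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 4
        pyramid.append(x)
    return pyramid

# Each pyramid level is then fed through its own side input path into the
# encoding layer at the matching scale.
```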
Correspondingly, taking the image input to the second encoding network 720 as the cropped prior constrained image as an example, for the second encoding network the process of obtaining the second feature map includes: obtaining the second image pyramid corresponding to the cropped prior constrained image, where the second image pyramid is an image collection obtained by downsampling the cropped prior constrained image according to the specified gradient and includes N second images to be processed; inputting the N second images to be processed into the corresponding encoding layers of the second encoding network and encoding them to obtain N encoding results of the cropped prior constrained image; and merging the N encoding results to obtain the second feature map of the cropped prior constrained image; where, when an encoding layer in the second encoding network is not the first encoding layer, its input also includes the encoding result output by the previous encoding layer.

The resolutions of the second images to be processed in the second image pyramid differ. Each second image to be processed corresponds to a side input path, which is used to feed it into the corresponding encoding layer of the second encoding network. As shown in FIG. 7, the second image pyramid 760 includes N second images to be processed, each corresponding to one side input path; for a non-first encoding layer in the second encoding network 720, the input includes the second image to be processed from the corresponding side input path and the encoding result output by the previous encoding layer.
In the embodiments of this application, the encoding layers in the encoding networks (the first encoding network / the second encoding network) can adopt a two-layer convolutional structure of "3x3 separable convolution + ReLU activation function + Dropout operation". FIG. 8 shows a schematic structural diagram of the encoding layer according to an exemplary embodiment of the present application. As shown in FIG. 8, the encoding layer includes a convolutional layer 810 and a convolutional layer 820, with a channel attention module 830 added between them by way of a residual connection. In the channel attention module 830, max pooling and average pooling are used to compress, along the spatial dimensions, the feature map obtained from convolutional layer 810; a shared network composed of a multi-layer perceptron (MLP) perceives the compressed feature maps, which are then merged and processed by an activation function to obtain the channel attention feature map. The channel attention feature map is multiplied with the input of the channel attention module and added to the output of convolutional layer 810 to form a residual structure, yielding an intermediate feature map. This is followed by convolutional layer 820, which downsamples the intermediate feature map using a convolution with a specified stride (illustratively, 2) to obtain the feature map (first feature map / encoding result) output by the layer. Optionally, for better extraction of the input image's features, the convolutional layers in the encoding network can be replaced with depthwise separable convolutional layers. A hedged sketch of such an encoding layer follows.
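The following PyTorch sketch shows one plausible reading of the encoding layer in FIG. 8: a channel attention block between two separable convolutions, with the second convolution striding by 2. The module names, reduction ratio and dropout rate are assumptions, not details given by this application.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, stride, 1, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class ChannelAttention(nn.Module):
    """Module 830: max/avg pooling compress the spatial dims, a shared MLP
    scores the channels, and the merged scores pass through a sigmoid."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):
        b, c, _, _ = x.shape
        scores = self.mlp(torch.amax(x, dim=(2, 3))) + self.mlp(x.mean(dim=(2, 3)))
        return x * torch.sigmoid(scores).view(b, c, 1, 1)

class EncodingLayer(nn.Module):
    """conv 810 -> channel attention with residual add -> strided conv 820."""
    def __init__(self, c_in, c_out, p_drop=0.1):
        super().__init__()
        self.conv810 = nn.Sequential(DepthwiseSeparableConv(c_in, c_out),
                                     nn.ReLU(), nn.Dropout2d(p_drop))
        self.attn = ChannelAttention(c_out)
        self.conv820 = nn.Sequential(DepthwiseSeparableConv(c_out, c_out, stride=2),
                                     nn.ReLU(), nn.Dropout2d(p_drop))

    def forward(self, x):
        feat = self.conv810(x)
        inter = self.attn(feat) + feat        # residual structure
        return self.conv820(inter)            # downsample by the specified stride
```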
Step 540: merge the first feature map and the second feature map to obtain an integrated feature map. When the image processing model is the model built on the butterfly network architecture shown in FIG. 7, the integrated feature map is the result of merging the first feature map output by the Nth encoding layer of the first encoding network 710 with the second feature map output by the second encoding network 720.
Step 550: call the decoding network in the image processing model to decode based on the integrated feature map and obtain the predicted segmented image of the first sample image. When the image processing model is the model built on the butterfly network architecture shown in FIG. 7, this can be implemented as follows: input the N first feature maps into the corresponding decoding layers and decode them to obtain N decoding results, where the N decoding results have the same resolution; merge the N decoding results to obtain the predicted segmented image of the first sample image; where, in response to a target decoding layer not being the first of the N decoding layers, the input of the target decoding layer also includes the decoding result output by the previous decoding layer.
The decoding layers in the decoding network can likewise adopt a two-layer convolutional structure of "3x3 separable convolution + ReLU activation function + Dropout operation". FIG. 9 shows a schematic structural diagram of the decoding layer according to an exemplary embodiment of the present application. As shown in FIG. 9, the decoding layer includes a convolutional layer 910 and a convolutional layer 920, with a spatial attention module 930 added between them by way of a residual connection; the spatial attention module 930 mainly focuses on position information. In the spatial attention module 930, max pooling and average pooling are applied along the channel dimension to obtain a feature map, which is then cascaded and convolved by a convolutional layer and processed by an activation function to obtain the spatial attention feature map. The spatial attention feature map is multiplied with the input of the spatial attention module and added to the output of convolutional layer 910 to form a residual structure, yielding an intermediate feature map; this is followed by convolutional layer 920, which downsamples the intermediate feature map using a convolution with a specified stride to obtain the decoding result output by the layer. A hedged sketch of the spatial attention module follows.
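As a companion to the encoding-layer sketch above, the following module implements the spatial attention described for FIG. 9: channel-wise max and average maps are concatenated, convolved, and passed through a sigmoid. The kernel size is an assumption.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Module 930: weight each spatial position using pooled channel stats."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        max_map = torch.amax(x, dim=1, keepdim=True)   # (b, 1, h, w)
        avg_map = torch.mean(x, dim=1, keepdim=True)   # (b, 1, h, w)
        attn = torch.sigmoid(self.conv(torch.cat([max_map, avg_map], dim=1)))
        return x * attn
```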
Step 560: call the generation network in the image processing model and generate the predicted generated image based on the integrated feature map. As shown in FIG. 7, the image processing model may include a generation network 740, which is used to generate the predicted generated image 741 from the integrated feature map.
Step 570: train the image processing model based on the difference between the predicted segmented image and the label image, and the difference between the predicted generated image and the second sample image; the second sample image is a sample medical image of the second modality of the target medical object; the label image corresponds to the target medical object and is used to indicate at least one specified-type region.
The image processing model with the butterfly network architecture provided by this application can combine deep semantic information with lower-layer position information, which mitigates vanishing gradients while preserving network width; in addition, supervision over multi-scale, multi-resolution input images allows more image features to be obtained, and thus better image segmentation and/or image generation results.
In summary, the image processing method for medical images provided by the embodiments of this application acquires multi-modal sample medical images of a target medical object, together with a label image, corresponding to the target medical object, that contains specified-type region labels; generates a predicted segmented image and a predicted generated image based on the first sample image among the multi-modal sample medical images; and, based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image of the target medical object, trains the image processing model comprising the first encoding network, the decoding network and the generation network. The trained image processing model can therefore derive features of multi-modal medical images from a single-modality medical image, so that the resulting medical image segmentation results contain relatively comprehensive information, improving the segmentation of medical images. Further, the trained image processing model can generate medical images of other modalities from a single-modality medical image, solving the problem of missing images in medical image analysis.
In one possible implementation, when training the image processing model, the training results of two image processing models can be combined to obtain the final image processing model. Illustratively, the input of the first image processing model is the first sample image, a sample medical image of the first modality of the target medical object; with the label image of the target medical object and the second sample image as labels, the first image processing model is trained to obtain a trained first image processing model, where the second sample image is a sample medical image of the second modality of the target medical object. The first image processing model is used to generate a predicted segmented image for an input medical image of the first modality, and/or to generate a second-modality generated image for an input first-modality medical image. The input of the second image processing model is the second sample image; with the label image of the target medical object and the first sample image as labels, the second image processing model is trained to obtain a trained second image processing model, which is used to generate a predicted segmented image for an input medical image of the second modality, and/or to generate a first-modality generated image for an input second-modality medical image. If the two image processing models take as input medical images of the same medical object in different modalities, the predicted segmented images obtained by the first image processing model and the second image processing model are the same, or their error is within a specified threshold range.
Optionally, to reduce network parameters, the parameters of the encoding network and decoding network of the first image processing model can be weight-shared with the parameters of the encoding network and decoding network of the second image processing model; this can be done during model training, or after training is complete. Illustratively, weight sharing can be implemented by replacing the parameters of the encoding and decoding networks of one image processing model with those of the other, or by taking the average of the two models' encoding network parameters and the average of their decoding network parameters and substituting the averages into both models' encoding and decoding networks. This application does not limit the manner of weight sharing.
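The averaging variant can be sketched as follows; the "encoder."/"decoder." parameter-name prefixes are assumptions about how the submodules are registered in the two models.

```python
import torch

@torch.no_grad()
def share_weights_by_averaging(model_a, model_b, prefixes=("encoder.", "decoder.")):
    """For every encoder/decoder parameter, give both models the element-wise
    mean of the two models' values (prefix names are assumptions)."""
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    for name, tensor_a in state_a.items():
        if name.startswith(prefixes):
            avg = (tensor_a + state_b[name]) / 2
            state_a[name] = avg
            state_b[name] = avg
    model_a.load_state_dict(state_a)
    model_b.load_state_dict(state_b)
```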
FIG. 10 shows a schematic diagram of the application process of the image processing model according to an exemplary embodiment of the present application, taking the segmentation of myocardial scar and myocardial edema as an example. This process can be implemented in a terminal or server deployed with the image processing model, or deployed with an image segmentation model constructed from the image processing model. As shown in FIG. 10, CMR images of the same medical object are acquired based on cardiac magnetic resonance technology, in FIG. 10 a bSSFP image, a T2-weighted image and an LGE image. In the first stage, the bSSFP image 1010 is input into the U-Net network 1020 to obtain the prior constrained image 1030 output by the U-Net network; the prior constrained image indicates the position information of the medical object in the CMR images. Based on the center position of the medical object in the prior constrained image, the prior constrained image, the T2-weighted image and the LGE image are cropped. The cropped T2-weighted image and cropped prior constrained image are input into the first image processing model 1040 corresponding to the T2 modality to obtain the first predicted segmented image 1050 output by the first image processing model, which contains the position information of myocardial scar and myocardial edema; the cropped LGE image and cropped prior constrained image are input into the second image processing model 1060 corresponding to the LGE modality to obtain the second predicted segmented image 1070 output by the second image processing model. To further improve the accuracy of the predicted segmentation, the first predicted segmented image and the second predicted segmented image are merged to obtain the segmented image 1080 of the medical object's myocardial scar and myocardial edema. In addition, when this process is implemented in a terminal or server deployed with the image processing model, the corresponding LGE image can be generated from the T2-weighted image, and the corresponding T2-weighted image can be generated from the LGE image. A sketch of this two-stage inference appears below.
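The following sketch traces the FIG. 10 pipeline end to end. The helper functions, the crop size and the element-wise-maximum merging rule are all assumptions made for illustration; the application does not specify them.

```python
import numpy as np

def region_center(mask: np.ndarray) -> tuple[int, int]:
    """Centroid of the nonzero region of the prior constrained image."""
    ys, xs = np.nonzero(mask > 0.5)
    return int(ys.mean()), int(xs.mean())

def crop_around(img: np.ndarray, center: tuple[int, int], size: int = 128):
    """Crop a size x size window centered on the medical object (size is
    an assumption)."""
    y, x = center
    h = size // 2
    return img[max(y - h, 0): y + h, max(x - h, 0): x + h]

def segment_myocardium(bssfp, t2w, lge, unet, model_t2, model_lge):
    prior = unet(bssfp)                     # stage 1: prior constrained image 1030
    c = region_center(prior)
    prior_c, t2w_c, lge_c = (crop_around(v, c) for v in (prior, t2w, lge))
    seg_t2 = model_t2(t2w_c, prior_c)       # first predicted segmented image 1050
    seg_lge = model_lge(lge_c, prior_c)     # second predicted segmented image 1070
    return np.maximum(seg_t2, seg_lge)      # merged segmentation 1080 (assumed rule)
```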
FIG. 11 shows a block diagram of an image processing device for medical images according to an exemplary embodiment of the present application. As shown in FIG. 11, the device includes:
a first encoding module 1110, configured to call the first encoding network in the image processing model to encode the first sample image and obtain the first feature map of the first sample image, where the first sample image is a sample medical image of the first modality of the target medical object;

a decoding module 1120, configured to call the decoding network in the image processing model to decode based on the first feature map and obtain the predicted segmented image of the first sample image, where the predicted segmented image is used to indicate at least one predicted specified-type region;

a generation module 1130, configured to call the generation network in the image processing model to generate the predicted generated image based on the first feature map, where the predicted generated image is a predicted image of the second modality of the first sample image;

a model training module 1140, configured to train the image processing model based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image, where the second sample image is a sample medical image of the second modality of the target medical object, and the label image corresponds to the target medical object and is used to indicate at least one specified-type region.
In one possible implementation, the model training module 1140 includes: a first determination submodule, configured to determine the function value of the first loss function based on the difference between the predicted segmented image and the label image; a second determination submodule, configured to determine the function value of the second loss function based on the difference between the predicted generated image and the second sample image; and a model training submodule, configured to train the image processing model based on the function value of the first loss function and the function value of the second loss function.
In one possible implementation, the model training submodule is configured to update the parameters of the first encoding network and the parameters of the decoding network based on the function value of the first loss function, and to update the parameters of the first encoding network and the parameters of the generation network based on the function value of the second loss function.
In one possible implementation, the first determination submodule includes: a first determination unit, configured to determine the function value of the first branch function of the first loss function based on the similarity between the predicted segmented image and the label image; a second determination unit, configured to determine the function value of the second branch function of the first loss function based on the position of at least one predicted specified-type region in the predicted segmented image and the position of at least one specified-type region in the label image; and a third determination unit, configured to determine the function value of the first loss function based on the function value of the first branch function and the function value of the second branch function.
In one possible implementation, the first determination unit is configured to obtain weight values respectively corresponding to the divided regions in the predicted segmented image, where the divided regions in the predicted segmented image include the at least one specified-type region, and to determine the function value of the first branch function of the first loss function based on those weight values and on the similarity between the divided regions in the predicted segmented image and the divided regions in the label image. A hedged sketch of such a two-branch loss follows.
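The sketch below pairs a weighted Focal-Dice-style similarity branch with a mean-squared-error position branch, matching the two branch functions just described. The Dice exponent (1/β with β = 2) and the branch weight λ = 100 follow illustrative values given elsewhere in this application, while the tensor layout and exact Focal Dice form are assumptions.

```python
import torch

def focal_dice_mse_loss(pred, target, weights, beta=2.0, lam=100.0, eps=1e-6):
    """pred/target: probability or one-hot maps of shape (B, T, H, W), one
    channel per divided region; weights: per-region weight vector of length T.
    The 1 - dice**(1/beta) form is an assumption consistent with the text."""
    inter = (pred * target).sum(dim=(2, 3))
    dice = (2 * inter + eps) / (pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3)) + eps)
    w = torch.as_tensor(weights, dtype=pred.dtype, device=pred.device)
    focal_dice = (w * (1.0 - dice ** (1.0 / beta))).sum(dim=1).mean()  # branch 1
    mse = torch.mean((pred - target) ** 2)                             # branch 2
    return focal_dice + lam * mse
```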
In one possible implementation, the device further includes: a discrimination module, configured to call a discriminator to discriminate the predicted generated image and obtain a discrimination result of the predicted generated image; and a third determination module, configured to determine the function value of a third loss function based on the discrimination result, where the discrimination result is used to indicate whether the predicted generated image is a real image. The model training module 1140 is then configured to train the image processing model based on the function value of the first loss function, the function value of the second loss function and the function value of the third loss function. A sketch of this adversarial term follows.
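A minimal sketch of the adversarial objective, assuming a standard binary cross-entropy GAN formulation; the concrete loss form is an assumption, since the application only states that the discrimination result indicates whether the predicted generated image is real.

```python
import torch
import torch.nn.functional as F

def adversarial_losses(discriminator, real_imgs, fake_imgs):
    """Discriminator loss over real second-modality images vs. predicted
    generated images, plus the generator's non-saturating counterpart."""
    real_logits = discriminator(real_imgs)
    fake_logits = discriminator(fake_imgs.detach())   # detach for the D step
    d_loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
              + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))

    gen_logits = discriminator(fake_imgs)             # no detach for the G step
    g_loss = F.binary_cross_entropy_with_logits(gen_logits, torch.ones_like(gen_logits))
    return d_loss, g_loss
```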
In one possible implementation, the first encoding network includes N encoding layers, the N encoding layers are connected pairwise, and N ≥ 2 and is a positive integer. The first encoding module 1110 includes: a set acquisition submodule, configured to acquire the first image pyramid of the first sample image, where the first image pyramid is an image collection obtained by downsampling the first sample image according to a specified gradient and includes N first images to be processed; and an encoding submodule, configured to input the N first images to be processed into the corresponding encoding layers and encode them to obtain the N first feature maps of the first sample image; where, in response to a target encoding layer not being the first of the N encoding layers, the input of the target encoding layer further includes the first feature map output by the previous encoding layer.
In one possible implementation, the decoding network in the image processing model includes N decoding layers, the N decoding layers are connected pairwise, and the N decoding layers correspond one to one to the N encoding layers. The decoding module 1120 includes: a decoding submodule, configured to input the N first feature maps into the corresponding decoding layers and decode them to obtain N decoding results, where the N decoding results have the same resolution; and a merging submodule, configured to merge the N decoding results to obtain the predicted segmented image of the first sample image; where, in response to a target decoding layer not being the first of the N decoding layers, the input of the target decoding layer also includes the decoding result output by the previous decoding layer.
In one possible implementation, the device further includes: an image acquisition module, configured to acquire the prior constrained image of the image processing model based on a third sample image, where the third sample image is a sample medical image of a third modality of the target medical object and the prior constrained image is used to indicate the position of the target medical object in the third sample image; a second encoding module, configured to call the second encoding network in the image processing model to encode based on the prior constrained image and obtain the second feature map of the third sample image; and a merging module, configured to merge the first feature map and the second feature map to obtain the integrated feature map. In this case, the decoding module 1120 is configured to call the decoding network in the image processing model to decode based on the integrated feature map and obtain the predicted segmented image of the first sample image, and the generation module 1130 is configured to call the generation network in the image processing model to generate the predicted generated image based on the integrated feature map.
In one possible implementation, the device further includes a cropping module, configured to crop the prior constrained image based on the position of the target medical object; the second encoding module is then configured to call the second encoding network in the image processing model to encode the cropped prior constrained image and obtain the second feature map of the third sample image.
In one possible implementation, the image acquisition module is configured to call a semantic segmentation network to process the third sample image and acquire the prior constrained image of the image processing model. In one possible implementation, the parameters of the second encoding network are weight-shared with the parameters of the first encoding network; a small sketch of this sharing follows.
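One simple way to realize this sharing is shown below; the attribute names are assumptions about how the two encoders are registered on the model.

```python
import torch

@torch.no_grad()
def tie_encoder_weights(model):
    """Copy the first encoding network's parameters into the second, so the
    two encoders hold identical weights (attribute names are assumptions)."""
    model.second_encoder.load_state_dict(model.first_encoder.state_dict())
```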
In summary, the image processing device for medical images provided by the embodiments of this application obtains multi-modal sample medical images of a target medical object, together with a label image, corresponding to the target medical object, that contains specified-type region labels; generates a predicted segmented image and a predicted generated image based on the first sample image among the multi-modal sample medical images; and, based on the difference between the predicted segmented image and the label image and the difference between the predicted generated image and the second sample image of the target medical object, trains the image processing model comprising the first encoding network, the decoding network and the generation network. The trained image processing model can thus derive features of multi-modal medical images from a single-modality medical image, so that the obtained medical image segmentation results contain more comprehensive information, improving the segmentation of medical images; further, the trained image processing model can generate medical images of other modalities from a single-modality medical image, solving the problem of missing images in medical image analysis.
FIG. 12 shows a structural block diagram of a computer device 1200 according to an exemplary embodiment of the present application. The computer device can be implemented as the server in the above solutions of the present application. The computer device 1200 includes a central processing unit (Central Processing Unit, CPU) 1201, a system memory 1204 including a random access memory (Random Access Memory, RAM) 1202 and a read-only memory (Read-Only Memory, ROM) 1203, and a system bus 1205 that connects the system memory 1204 and the central processing unit 1201.
The computer device 1200 also includes a mass storage device 1206 for storing an operating system 1209, application programs 1210 and other program modules 1211. The mass storage device 1206 is connected to the central processing unit 1201 through a mass storage controller (not shown) connected to the system bus 1205. The mass storage device 1206 and its associated computer-readable media provide non-volatile storage for the computer device 1200; that is, the mass storage device 1206 may include a computer-readable medium (not shown) such as a hard disk or a Compact Disc Read-Only Memory (CD-ROM) drive.
Without loss of generality, computer-readable media may comprise computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other solid-state storage technology, CD-ROM, Digital Versatile Disc (DVD) or other optical storage, tape cartridges, magnetic tape, disk storage or other magnetic storage devices. Of course, those skilled in the art will know that computer storage media are not limited to the above. The above system memory 1204 and mass storage device 1206 may be collectively referred to as memory.
According to various embodiments of the present disclosure, the computer device 1200 may also be connected, through a network such as the Internet, to a remote computer on the network for operation. That is, the computer device 1200 can connect to the network 1208 through a network interface unit 1207 connected to the system bus 1205, or the network interface unit 1207 can be used to connect to other types of networks or remote computer systems (not shown).

The memory also includes at least one instruction, at least one program, a code set or an instruction set, which is stored in the memory; the central processing unit 1201 executes the at least one instruction, at least one program, code set or instruction set to implement all or part of the steps of the image processing method for medical images shown in the above embodiments.
FIG. 13 shows a structural block diagram of a computer device 1300 provided by an exemplary embodiment of the present application. The computer device 1300 may be implemented as the above-mentioned terminal, such as a smartphone, a tablet computer, a notebook computer or a desktop computer. The computer device 1300 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names. Generally, the computer device 1300 includes a processor 1301 and a memory 1302.
The processor 1301 may include one or more processing cores, such as a 4-core processor or a 13-core processor. The processor 1301 can be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) or PLA (Programmable Logic Array). The processor 1301 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the wake-up state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1301 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1301 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 1302 may include one or more computer-readable storage media, which may be non-transitory. The memory 1302 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1302 is used to store at least one instruction, which is executed by the processor 1301 to implement all or part of the steps of the image processing method for medical images provided by the method embodiments of this application.
In some embodiments, the computer device 1300 may optionally further include a peripheral device interface 1303 and at least one peripheral device. The processor 1301, the memory 1302 and the peripheral device interface 1303 may be connected through buses or signal lines, and each peripheral device can be connected to the peripheral device interface 1303 through a bus, a signal line or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 1304, a display screen 1305, a camera assembly 1306, an audio circuit 1307 and a power supply 1309. The peripheral device interface 1303 may be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 1301 and the memory 1302. In some embodiments, the processor 1301, the memory 1302 and the peripheral device interface 1303 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1301, the memory 1302 and the peripheral device interface 1303 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
In some embodiments, the computer device 1300 also includes one or more sensors 1310, including but not limited to an acceleration sensor 1311, a gyroscope sensor 1312, a pressure sensor 1313, an optical sensor 1315 and a proximity sensor 1316. Those skilled in the art can understand that the structure shown in FIG. 13 does not constitute a limitation on the computer device 1300, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
In an exemplary embodiment, a computer-readable storage medium is also provided for storing at least one instruction, at least one program, a code set or an instruction set, which is loaded and executed by a processor to implement all or part of the steps of the above image processing method for medical images. For example, the computer-readable storage medium can be a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product or computer program is also provided, comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs all or part of the steps of the method shown in any one of the embodiments of FIG. 2, FIG. 4 or FIG. 5.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Image Processing (AREA)

Abstract

An image processing method, apparatus, device and storage medium for medical images, relating to the field of medical technology. The method includes: calling a first encoding network to encode a sample medical image of a first modality of a target medical object to obtain a first feature map (210); calling a decoding network to obtain, based on the first feature map, a predicted segmented image indicating at least one predicted specified-type region (220); calling a generation network to generate, based on the first feature map, a predicted generated image of a second modality (230); and training an image processing model based on the difference between the predicted segmented image and a label image and the difference between the predicted generated image and a sample medical image of a second modality of the target medical object (240). Through the above method, the trained image processing model can obtain relatively comprehensive medical image segmentation results, improving the segmentation of medical images.


Claims (20)

  1. 一种用于医学图像的图像处理方法,所述方法由计算机设备执行,所述方法包括:
    调用图像处理模型中的第一编码网络,对第一样本图像进行编码,获得所述第一样本图像的第一特征图;所述第一样本图像是目标医学对象的第一模态的样本医学图像;
    调用所述图像处理模型中的解码网络,基于所述第一特征图进行解码,获得所述第一样本图像的预测分割图像;所述预测分割图像用以指示预测出的至少一个指定类型区域;
    调用所述图像处理模型中的生成网络,基于所述第一特征图生成预测生成图像;所述预测生成图像是所述第一样本图像的第二模态的预测图像;
    基于所述预测分割图像与标签图像之间的差异,以及,所述预测生成图像与第二样本图像之间的差异,对所述图像处理模型进行训练;所述第二样本图像是所述目标医学对象的第二模态的样本医学图像;所述标签图像是与所述目标医学对象对应的,用以指示至少一个指定类型区域的图像。
  2. 根据权利要求1所述的方法,所述基于预测分割图像与所述标签图像之间的差异,以及,所述预测生成图像与所述第二样本图像之间的差异,对所述图像处理模型进行训练,包括:
    基于所述预测分割图像与所述标签图像之间的差异,确定第一损失函数的函数值;
    基于所述预测生成图像与所述第二样本图像之间的差异,确定第二损失函数的函数值;
    基于所述第一损失函数的函数值,以及所述第二损失函数的函数值,对所述图像处理模型进行训练。
  3. 根据权利要求2所述的方法,所述基于所述第一损失函数的函数值,以及所述第二损失函数的函数值,对所述图像处理模型进行训练,包括:
    基于所述第一损失函数的函数值,对所述第一编码网络的参数以及所述解码网络的参数进行更新;
    基于所述第二损失函数的函数值,对所述第一编码网络的参数以及所述生成网络的参数进行更新。
  4. 根据权利要求2所述的方法,所述基于所述预测分割图像与所述标签图像之间的差异,确定第一损失函数的函数值,包括:
    基于所述预测分割图像与所述标签图像的相似性,确定所述第一损失函数的第一分支函数的函数值;
    基于所述预测分割图像中预测出的至少一个指定类型区域的位置,与所述标签图像中的至少一个指定类型区域的位置,确定所述第一损失函数的第二分支函数的函数值;
    基于所述第一分支函数的函数值以及所述第二分支函数的函数值,确定所述第一损失函数的函数值。
  5. The method according to claim 4, wherein the determining the function value of the first branch function of the first loss function based on the similarity between the predicted segmentation image and the label image comprises:
    obtaining weight values respectively corresponding to partitioned regions in the predicted segmentation image, the partitioned regions in the predicted segmentation image containing the at least one region of the specified type;
    determining the function value of the first branch function of the first loss function based on the weight values respectively corresponding to the partitioned regions in the predicted segmentation image and on similarities between the partitioned regions in the predicted segmentation image and the corresponding partitioned regions in the label image.
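For illustration only: a region-weighted similarity term in the spirit of claim 5, using a Dice-style overlap per partitioned region. The weighting scheme, per-channel region encoding and epsilon are assumptions.

```python
# Hedged sketch of a weighted per-region similarity loss; details are assumed.
import torch

def weighted_region_similarity(pred, target, weights, eps=1e-6):
    """pred, target: (B, C, H, W) soft / one-hot masks, one channel per
    partitioned region; weights: (C,) weight value per partitioned region."""
    inter = (pred * target).sum(dim=(0, 2, 3))
    union = pred.sum(dim=(0, 2, 3)) + target.sum(dim=(0, 2, 3))
    dice_per_region = (2 * inter + eps) / (union + eps)     # similarity per region
    return (weights * (1 - dice_per_region)).sum()          # first-branch value

pred    = torch.softmax(torch.randn(2, 3, 64, 64), dim=1)  # 3 partitioned regions
target  = torch.zeros(2, 3, 64, 64)
target[:, 0] = 1.0                                          # toy one-hot label image
weights = torch.tensor([0.2, 0.5, 0.3])                     # per-region weight values
branch1_value = weighted_region_similarity(pred, target, weights)
```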
  6. The method according to claim 2, further comprising:
    calling a discriminator to discriminate the predicted generated image to obtain a discrimination result of the predicted generated image;
    determining a function value of a third loss function based on the discrimination result, the discrimination result indicating whether the predicted generated image is a real image;
    wherein the training the image processing model based on the function value of the first loss function and the function value of the second loss function comprises:
    training the image processing model based on the function value of the first loss function, the function value of the second loss function and the function value of the third loss function.
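For illustration only: one possible discriminator-based third loss matching claim 6. The least-squares GAN objective and the patch-style discriminator are assumptions; the claim only requires a discrimination result indicating real versus generated.

```python
# Sketch of an assumed adversarial third loss; not the claimed formulation.
import torch
import torch.nn as nn

discriminator = nn.Sequential(                     # judges real vs. generated modality-2
    nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(16, 1, 4, stride=2, padding=1))

predicted_generated = torch.tanh(torch.randn(2, 1, 64, 64))  # stand-in generator output
real_modality2      = torch.randn(2, 1, 64, 64)              # second sample image

fake_score = discriminator(predicted_generated)    # discrimination result (patch scores)
real_score = discriminator(real_modality2)

d_loss = ((real_score - 1) ** 2).mean() + (fake_score ** 2).mean()  # discriminator side
third_loss_value = ((fake_score - 1) ** 2).mean()  # pushes generated image toward "real"
```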
  7. The method according to claim 1, wherein the first encoding network comprises N encoding layers, the N encoding layers being connected in pairs, where N is a positive integer and N ≥ 2;
    wherein the calling the first encoding network in the image processing model to encode the first sample image to obtain the first feature map of the first sample image comprises:
    obtaining a first image pyramid of the first sample image, the first image pyramid being a set of images obtained by downsampling the first sample image according to a specified gradient, the first image pyramid containing N first images to be processed;
    inputting the N first images to be processed into the corresponding encoding layers respectively, and encoding the N first images to be processed to obtain N first feature maps of the first sample image;
    wherein, in response to a target encoding layer being any encoding layer other than the first one of the N encoding layers, an input of the target encoding layer further includes the first feature map output by the previous encoding layer.
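For illustration only: building the first image pyramid of claim 7 by repeated downsampling. The downsampling factor of 2 stands in for the "specified gradient" and is an assumption, as is N = 3.

```python
# Sketch of an image pyramid; factor and depth are assumed values.
import torch
import torch.nn.functional as F

def image_pyramid(x, n_levels=3, factor=0.5):
    levels = [x]
    for _ in range(n_levels - 1):
        levels.append(F.interpolate(levels[-1], scale_factor=factor,
                                    mode='bilinear', align_corners=False))
    return levels                       # N "first images to be processed"

x1 = torch.randn(2, 1, 64, 64)
for i, lvl in enumerate(image_pyramid(x1)):
    print(i, tuple(lvl.shape))          # (2,1,64,64), (2,1,32,32), (2,1,16,16)
```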
  8. The method according to claim 7, wherein the decoding network in the image processing model comprises N decoding layers, the N decoding layers being connected in pairs and in one-to-one correspondence with the N encoding layers;
    wherein the calling the decoding network in the image processing model to decode based on the first feature map to obtain the predicted segmentation image of the first sample image comprises:
    inputting the N first feature maps into the corresponding decoding layers respectively, and decoding the N first feature maps to obtain N decoding results, the N decoding results having the same resolution;
    merging the N decoding results to obtain the predicted segmentation image of the first sample image;
    wherein, in response to a target decoding layer being any decoding layer other than the first one of the N decoding layers, an input of the target decoding layer further includes the decoding result output by the previous decoding layer.
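For illustration only: the decoding side of claims 7 and 8, where N feature maps are brought to a common resolution and merged. Summation as the merge operation is an assumption; concatenation would also fit the claim language.

```python
# Sketch of decoding N multi-scale feature maps to one segmentation output.
import torch
import torch.nn as nn
import torch.nn.functional as F

feature_maps = [torch.randn(2, 16, s, s) for s in (64, 32, 16)]  # N = 3 levels

decoded = [F.interpolate(fm, size=(64, 64), mode='bilinear', align_corners=False)
           for fm in feature_maps]                 # N decoding results, same resolution
merged = torch.stack(decoded, dim=0).sum(dim=0)    # merge the N decoding results
seg_head = nn.Conv2d(16, 2, 1)
predicted_segmentation = seg_head(merged)          # (2, 2, 64, 64)
```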
  9. The method according to claim 1, further comprising:
    obtaining a prior constraint image of the image processing model based on a third sample image, the third sample image being a sample medical image of the target medical object in a third modality, and the prior constraint image indicating a position of the target medical object in the third sample image;
    calling a second encoding network in the image processing model to encode based on the prior constraint image to obtain a second feature map of the third sample image;
    merging the first feature map and the second feature map to obtain a comprehensive feature map;
    wherein the calling the decoding network in the image processing model to decode based on the first feature map to obtain the predicted segmentation image of the first sample image comprises:
    calling the decoding network in the image processing model to decode based on the comprehensive feature map to obtain the predicted segmentation image of the first sample image;
    wherein the calling the generation network in the image processing model to generate the predicted generated image based on the first feature map comprises:
    calling the generation network in the image processing model to generate the predicted generated image based on the comprehensive feature map.
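For illustration only: fusing the prior-constraint features (second feature map) with the image features (first feature map) into the comprehensive feature map of claim 9. Channel-wise concatenation is an assumed realization of "merging".

```python
# Sketch of feature-map merging; concatenation is an assumption.
import torch

first_feature  = torch.randn(2, 16, 64, 64)   # from the first encoding network
second_feature = torch.randn(2, 16, 64, 64)   # from the second encoding network (prior)
comprehensive  = torch.cat([first_feature, second_feature], dim=1)  # (2, 32, 64, 64)
# Both the decoding network and the generation network would consume `comprehensive`.
```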
  10. The method according to claim 9, further comprising, before the calling the second encoding network in the image processing model to encode based on the prior constraint image to obtain the second feature map of the third sample image:
    cropping the prior constraint image based on the position of the target medical object;
    wherein the calling the second encoding network in the image processing model to encode based on the prior constraint image to obtain the second feature map of the third sample image comprises:
    calling the second encoding network in the image processing model to encode the cropped prior constraint image to obtain the second feature map of the third sample image.
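For illustration only: a toy crop of the prior constraint image around the target object's position, as in claim 10. Deriving the crop from the bounding box of a binary prior mask is an assumption.

```python
# Sketch of position-based cropping; the box derivation is assumed.
import torch

prior = torch.zeros(1, 1, 64, 64)
prior[..., 20:40, 24:44] = 1.0                       # toy target-object position
ys, xs = torch.nonzero(prior[0, 0], as_tuple=True)
cropped = prior[..., ys.min():ys.max() + 1, xs.min():xs.max() + 1]
print(tuple(cropped.shape))                          # (1, 1, 20, 20)
```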
  11. The method according to claim 9, wherein the obtaining the prior constraint image of the image processing model based on the third sample image comprises:
    calling a semantic segmentation network to process the third sample image to obtain the prior constraint image of the image processing model, the semantic segmentation network being trained on a sample image set, the sample image set containing a fourth sample image and an approximate label of the fourth sample image, the fourth sample image being a sample medical image, in the third modality, of a medical object other than the target medical object, and the approximate label indicating a position of the other medical object in the fourth sample image.
  12. The method according to claim 9, wherein parameters of the second encoding network and parameters of the first encoding network are weight-shared.
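For illustration only: one common way to realize the weight sharing of claim 12 is to reuse a single module instance for both encoding branches, so they hold identical parameters; other sharing schemes would also fit the claim.

```python
# Sketch of weight sharing by module reuse; an assumed realization.
import torch.nn as nn

shared_encoder = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
first_encoding_network  = shared_encoder
second_encoding_network = shared_encoder           # same weights, two roles
assert all(p1 is p2 for p1, p2 in zip(first_encoding_network.parameters(),
                                      second_encoding_network.parameters()))
```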
  13. An image processing apparatus for medical images, the apparatus comprising:
    a first encoding module, configured to call a first encoding network in an image processing model to encode a first sample image to obtain a first feature map of the first sample image, the first sample image being a sample medical image of a target medical object in a first modality;
    a decoding module, configured to call a decoding network in the image processing model to decode based on the first feature map to obtain a predicted segmentation image of the first sample image, the predicted segmentation image indicating at least one predicted region of a specified type;
    a generation module, configured to call a generation network in the image processing model to generate a predicted generated image based on the first feature map, the predicted generated image being a predicted image of the first sample image in a second modality;
    a model training module, configured to train the image processing model based on a difference between the predicted segmentation image and a label image and a difference between the predicted generated image and a second sample image, the second sample image being a sample medical image of the target medical object in the second modality, and the label image being an image that corresponds to the target medical object and indicates at least one region of the specified type.
  14. The apparatus according to claim 13, wherein the model training module comprises:
    a first determination submodule, configured to determine a function value of a first loss function based on the difference between the predicted segmentation image and the label image;
    a second determination submodule, configured to determine a function value of a second loss function based on the difference between the predicted generated image and the second sample image;
    a model training submodule, configured to train the image processing model based on the function value of the first loss function and the function value of the second loss function.
  15. The apparatus according to claim 14, wherein the model training submodule is configured to:
    update parameters of the first encoding network and parameters of the decoding network based on the function value of the first loss function;
    update parameters of the first encoding network and parameters of the generation network based on the function value of the second loss function.
  16. The apparatus according to claim 14, wherein the first determination submodule comprises:
    a first determination unit, configured to determine a function value of a first branch function of the first loss function based on a similarity between the predicted segmentation image and the label image;
    a second determination unit, configured to determine a function value of a second branch function of the first loss function based on positions of the at least one predicted region of the specified type in the predicted segmentation image and positions of the at least one region of the specified type in the label image;
    a third determination unit, configured to determine the function value of the first loss function based on the function value of the first branch function and the function value of the second branch function.
  17. The apparatus according to claim 16, wherein the first determination unit is configured to:
    obtain weight values respectively corresponding to partitioned regions in the predicted segmentation image, the partitioned regions in the predicted segmentation image containing the at least one region of the specified type;
    determine the function value of the first branch function of the first loss function based on the weight values respectively corresponding to the partitioned regions in the predicted segmentation image and on similarities between the partitioned regions in the predicted segmentation image and the corresponding partitioned regions in the label image.
  18. A computer device, comprising a processor and a memory, the memory storing at least one instruction, the at least one instruction being loaded and executed by the processor to implement the image processing method for medical images according to any one of claims 1 to 12.
  19. A computer-readable storage medium, having at least one computer program stored therein, the computer program being loaded and executed by a processor to implement the image processing method for medical images according to any one of claims 1 to 12.
  20. A computer program product, comprising computer instructions stored in a computer-readable storage medium, the computer instructions being read from the computer-readable storage medium and executed by a processor of a computer device, so that the computer device implements the image processing method for medical images according to any one of claims 1 to 12.
PCT/CN2022/107341 2021-08-16 2022-07-22 Image processing method and apparatus for medical image, device and storage medium WO2023020198A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/132,824 US20230245426A1 (en) 2021-08-16 2023-04-10 Image processing method and apparatus for medical image, device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110938701.XA 2021-08-16 Image processing method and apparatus for medical image, device and storage medium
CN202110938701.X 2021-08-16

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/132,824 Continuation US20230245426A1 (en) 2021-08-16 2023-04-10 Image processing method and apparatus for medical image, device and storage medium

Publications (1)

Publication Number Publication Date
WO2023020198A1 true WO2023020198A1 (zh) 2023-02-23

Family

ID=80868460

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/107341 WO2023020198A1 (zh) 2021-08-16 2022-07-22 Image processing method and apparatus for medical image, device and storage medium

Country Status (3)

Country Link
US (1) US20230245426A1 (zh)
CN (1) CN114283151A (zh)
WO (1) WO2023020198A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283151A (zh) * 2021-08-16 2022-04-05 腾讯科技(深圳)有限公司 用于医学图像的图像处理方法、装置、设备及存储介质
CN114494251B (zh) * 2022-04-06 2022-07-15 南昌睿度医疗科技有限公司 Spect图像处理方法以及相关设备
CN115115575A (zh) * 2022-04-27 2022-09-27 腾讯医疗健康(深圳)有限公司 一种图像检测方法、装置、计算机设备及存储介质
CN114708436B (zh) * 2022-06-02 2022-09-02 深圳比特微电子科技有限公司 语义分割模型的训练方法、语义分割方法、装置和介质
WO2024077738A1 (en) * 2022-10-13 2024-04-18 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Learned image compression based on fast residual channel attention network
CN117036181A (zh) * 2022-10-24 2023-11-10 腾讯科技(深圳)有限公司 Training method and apparatus for image processing model, electronic device and storage medium
CN116778021B (zh) * 2023-08-22 2023-11-07 北京大学 Medical image generation method and apparatus, electronic device and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180336677A1 (en) * 2017-05-18 2018-11-22 Toshiba Medical Systems Corporation Apparatus and method for medical image processing
CN109754447A (zh) * 2018-12-28 2019-05-14 上海联影智能医疗科技有限公司 Image generation method, apparatus, device and storage medium
CN110544275A (zh) * 2019-08-19 2019-12-06 中山大学 Method, system and medium for generating registered multimodal MRI with lesion segmentation labels
CN111145147A (zh) * 2019-12-14 2020-05-12 中国科学院深圳先进技术研究院 Segmentation method for multimodal medical images and terminal device
CN112669247A (zh) * 2020-12-09 2021-04-16 深圳先进技术研究院 Prior-guided network for multi-task medical image synthesis
CN114283151A (zh) * 2021-08-16 2022-04-05 腾讯科技(深圳)有限公司 Image processing method and apparatus for medical image, device and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116402812A (zh) * 2023-06-07 2023-07-07 江西业力医疗器械有限公司 Method and system for processing medical image data
CN116402812B (zh) * 2023-06-07 2023-09-19 江西业力医疗器械有限公司 Method and system for processing medical image data
CN116433795A (zh) * 2023-06-14 2023-07-14 之江实验室 Multimodal image generation method and apparatus based on generative adversarial network
CN116433795B (zh) * 2023-06-14 2023-08-29 之江实验室 Multimodal image generation method and apparatus based on generative adversarial network
CN116580037A (zh) * 2023-07-10 2023-08-11 天津医科大学第二医院 Deep learning-based nasopharyngeal carcinoma image segmentation method and system
CN116580037B (zh) * 2023-07-10 2023-10-13 天津医科大学第二医院 Deep learning-based nasopharyngeal carcinoma image segmentation method and system

Also Published As

Publication number Publication date
US20230245426A1 (en) 2023-08-03
CN114283151A (zh) 2022-04-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22857511

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE