CN117111696A

CN117111696A - Medical image segmentation method and training method of medical image segmentation model

Info

Publication number: CN117111696A
Application number: CN202311153927.4A
Authority: CN
Inventors: 石一磊; 郑子璇; 胡敬良; 牟立超; 侯雨; 陈咏虹
Original assignee: Maide Intelligent Technology Wuxi Co ltd
Current assignee: Maide Intelligent Technology Wuxi Co ltd
Priority date: 2023-09-07
Filing date: 2023-09-07
Publication date: 2023-11-24

Abstract

The application provides a medical image segmentation method and a training method of a medical image segmentation model, wherein the medical image segmentation method comprises the following steps: acquiring a medical image to be processed; inputting the medical image to be processed into a medical image segmentation model trained in advance to obtain a segmentation result output by the medical image segmentation model; the medical image segmentation model comprises a natural image segmentation module, wherein the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, the segmentation adapter is embedded into the image encoder, and the image encoder is trained by updating parameters of the segmentation adapter. Therefore, the segmentation adapter can be introduced on the basis of the natural image segmentation module, so that the backbone network (namely the image encoder) can be frozen in the process of training the medical image segmentation model, and only a small amount of medical segmentation image data is utilized to train the segmentation adapter, so that the trained medical image segmentation model is obtained.

Description

Medical image segmentation method and training method of medical image segmentation model

Technical Field

The application relates to the technical field of image processing, in particular to a medical image segmentation method and a training method of a medical image segmentation model.

Background

With the development of medical technology, the segmentation accuracy of medical image segmentation models keeps improving. Take thyroid cancer as an example: in recent years its incidence has risen rapidly, and the 2020 global cancer survey results show that its incidence is higher in women than in men and higher in cities than in rural areas.

Ultrasound is the first-choice examination method for thyroid lesions: it can detect a lesion and at the same time give a preliminary judgment of its biological behavior, and it is convenient and safe. Rapid technological progress has driven the wide application of artificial intelligence (AI) in ultrasound imaging within medical big data, with clear advantages. Under daily work overload and the pressure of complex, high-risk examinations, an ultrasound AI system can optimize the examination workflow, standardize diagnostic criteria, shorten examination and reporting time, and significantly improve the diagnostic confidence and working efficiency of sonographers. AI-based segmentation of thyroid nodule ultrasound images further helps doctors locate nodules more quickly and confirm nodule morphology, which facilitates judging whether a nodule is benign or malignant. The technology is expected to have broad prospects for innovation and development in supporting future ultrasound diagnosis and treatment technology, personnel training and the like.

Therefore, a thyroid nodule ultrasound image segmentation method with high accuracy is needed to assist doctors in making rapid and accurate diagnoses during examinations. By extension, other medical images, like thyroid nodule ultrasound images, also need a segmentation method with higher accuracy.

In the prior art, the accuracy of segmentation is generally improved mainly by improving the architecture of a medical image segmentation model or performing data enhancement. However, due to the privacy of medical data and the expertise of data labeling, large-scale and high-quality medical segmentation image data is lacking, so that the precision of a segmentation model is low.

Disclosure of Invention

The embodiment of the application aims to provide a medical image segmentation method and a medical image segmentation model training method, which are used for solving the technical problem that the precision of a segmentation model is lower due to the fact that large-scale high-quality medical segmentation image data is lacking due to the privacy of medical data and the specialty of data labeling in the prior art.

In a first aspect, an embodiment of the present application provides a medical image segmentation method, including: acquiring a medical image to be processed; inputting the medical image to be processed into a medical image segmentation model trained in advance to obtain a segmentation result output by the medical image segmentation model; the medical image segmentation model comprises a natural image segmentation module, wherein the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, the segmentation adapter is embedded into the image encoder, and training of the image encoder is achieved by updating parameters of the segmentation adapter.

In the above scheme, the medical image to be processed can be segmented by using the medical image segmentation model trained in advance, so that a corresponding segmentation result is obtained. The medical image segmentation model comprises a natural image segmentation module, and a segmentation adapter can be introduced based on the natural image segmentation module, so that a backbone network (namely an image encoder) can be frozen in the process of training the medical image segmentation model, and only a small amount of medical segmentation image data is used for training the segmentation adapter, so that a trained medical image segmentation model is obtained. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.

In an optional embodiment, the inputting the medical image to be processed into a pre-trained medical image segmentation model to obtain a segmentation result output by the medical image segmentation model includes: inputting the medical image to be processed into the image encoder to obtain an encoding result output by the image encoder; and inputting the coding result into the image decoder to obtain the segmentation result output by the image decoder. In the above scheme, the natural image segmentation module may be used to segment the medical image to be processed, so as to obtain a corresponding segmentation result. The natural image segmentation module has strong segmentation capability, so that the whole medical image segmentation model has a high-precision segmentation effect in the medical image.

In an alternative embodiment, the medical image segmentation method further comprises: training the medical image segmentation model by using the following steps: acquiring medical segmentation image data and a medical image segmentation model to be trained; loading corresponding pre-training parameters for the natural image segmentation module; and updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model. In the above scheme, the image encoder may be frozen during the training of the medical image segmentation model, and only a small amount of medical segmentation image data is used to train the segmentation adapter and simultaneously train the image decoder, thereby obtaining a trained medical image segmentation model. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.

In an alternative embodiment, the medical image segmentation model further comprises: a spatial multi-scale information feature extraction module; inputting the medical image to be processed into a medical image segmentation model trained in advance to obtain a segmentation result output by the medical image segmentation model, wherein the method comprises the following steps: inputting the medical image to be processed into the spatial multi-scale information feature extraction module to obtain feature data output by the spatial multi-scale information feature extraction module; inputting the medical image to be processed and the characteristic data into the image encoder to obtain an encoding result output by the image encoder; and inputting the coding result into the image decoder to obtain the segmentation result output by the image decoder. In the scheme, a spatial multi-scale information feature extraction module can be introduced, and before the medical image to be processed is subjected to image segmentation, multi-scale spatial information is transmitted to the backbone network based on the medical image to be processed, so that the segmentation accuracy of the medical image segmentation model obtained through training is improved.

In an alternative embodiment, the medical image segmentation method further comprises: training the medical image segmentation model by using the following steps: acquiring medical segmentation image data and a medical image segmentation model to be trained; loading corresponding pre-training parameters for the natural image segmentation module; and updating parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model. In the above scheme, in the process of training the medical image segmentation model, the image encoder can be frozen, only a small amount of medical segmentation image data is used for training the segmentation adapter, and the image decoder and the spatial multi-scale information feature extraction module are trained, so that the trained medical image segmentation model is obtained. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.

In a second aspect, an embodiment of the present application provides a training method for a medical image segmentation model, including: acquiring medical segmentation image data and a medical image segmentation model to be trained, wherein the medical image segmentation model comprises a natural image segmentation module, the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, and the segmentation adapter is embedded in the image encoder; loading corresponding pre-training parameters for the natural image segmentation module; and updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

In the above scheme, the medical image segmentation model includes a natural image segmentation module, and the segmentation adapter can be introduced based on the natural image segmentation module, so that the backbone network (i.e., the image encoder) can be frozen in the process of training the medical image segmentation model, and only a small amount of medical segmentation image data is used for training the segmentation adapter, so that a trained medical image segmentation model is obtained. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.

In an alternative embodiment, the medical image segmentation model further comprises: a spatial multi-scale information feature extraction module; the updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model comprises: updating parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

In a third aspect, an embodiment of the present application provides a medical image segmentation apparatus, including: the first acquisition module is used for acquiring a medical image to be processed; the input module is used for inputting the medical image to be processed into a medical image segmentation model trained in advance to obtain a segmentation result output by the medical image segmentation model; the medical image segmentation model comprises a natural image segmentation module, wherein the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, the segmentation adapter is embedded into the image encoder, and training of the image encoder is achieved by updating parameters of the segmentation adapter.

In an alternative embodiment, the input module is specifically configured to: inputting the medical image to be processed into the image encoder to obtain an encoding result output by the image encoder; and inputting the coding result into the image decoder to obtain the segmentation result output by the image decoder. In the above scheme, the natural image segmentation module may be used to segment the medical image to be processed, so as to obtain a corresponding segmentation result. The natural image segmentation module has strong segmentation capability, so that the whole medical image segmentation model has a high-precision segmentation effect in the medical image.

In an alternative embodiment, the medical image segmentation apparatus further comprises: the first training module is used for training the medical image segmentation model by the following steps: acquiring medical segmentation image data and a medical image segmentation model to be trained; loading corresponding pre-training parameters for the natural image segmentation module; and updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model. In the above scheme, the image encoder may be frozen during the training of the medical image segmentation model, and only a small amount of medical segmentation image data is used to train the segmentation adapter and simultaneously train the image decoder, thereby obtaining a trained medical image segmentation model. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.

In an alternative embodiment, the medical image segmentation model further comprises: a spatial multi-scale information feature extraction module; the input module is specifically used for: inputting the medical image to be processed into the spatial multi-scale information feature extraction module to obtain feature data output by the spatial multi-scale information feature extraction module; inputting the medical image to be processed and the characteristic data into the image encoder to obtain an encoding result output by the image encoder; and inputting the coding result into the image decoder to obtain the segmentation result output by the image decoder. In the scheme, a spatial multi-scale information feature extraction module can be introduced, and before the medical image to be processed is subjected to image segmentation, multi-scale spatial information is transmitted to the backbone network based on the medical image to be processed, so that the segmentation accuracy of the medical image segmentation model obtained through training is improved.

In an alternative embodiment, the medical image segmentation apparatus further comprises: the second training module is used for training the medical image segmentation model by the following steps: acquiring medical segmentation image data and a medical image segmentation model to be trained; loading corresponding pre-training parameters for the natural image segmentation module; and updating parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model. In the above scheme, in the process of training the medical image segmentation model, the image encoder can be frozen, only a small amount of medical segmentation image data is used for training the segmentation adapter, and the image decoder and the spatial multi-scale information feature extraction module are trained, so that the trained medical image segmentation model is obtained. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.

In a fourth aspect, an embodiment of the present application provides a training apparatus for a medical image segmentation model, including: the second acquisition module is used for acquiring medical segmentation image data and a medical image segmentation model to be trained, wherein the medical image segmentation model comprises a natural image segmentation module, the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, and the segmentation adapter is embedded in the image encoder; the loading module is used for loading corresponding pre-training parameters for the natural image segmentation module; and the updating module is used for updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

In an alternative embodiment, the medical image segmentation model further comprises: a spatial multi-scale information feature extraction module; the updating module is specifically used for: updating parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

In a fifth aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory, and a bus; the processor and the memory complete communication with each other through the bus; the memory stores computer program instructions executable by the processor, the processor invoking the computer program instructions to be able to perform the medical image segmentation method according to the first aspect or the training method of the medical image segmentation model according to the second aspect.

In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer program instructions that, when executed by a computer, cause the computer to perform the medical image segmentation method according to the first aspect or the training method of the medical image segmentation model according to the second aspect.

In order to make the above objects, features and advantages of the present application more comprehensible, embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a medical image segmentation method according to an embodiment of the present application;

FIG. 2 is a schematic diagram of a ViT structure according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a ViT+Adapter structure according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an SMS structure according to an embodiment of the present application;

FIG. 5 is a flowchart of a training method of a medical image segmentation model according to an embodiment of the present application;

FIG. 6 is a block diagram of a medical image segmentation apparatus according to an embodiment of the present application;

FIG. 7 is a block diagram of a training device for a medical image segmentation model according to an embodiment of the present application;

FIG. 8 is a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.

Referring to fig. 1, fig. 1 is a flowchart of a medical image segmentation method according to an embodiment of the present application, where the medical image segmentation method may include the following steps:

Step S101: acquiring a medical image to be processed.

Step S102: inputting the medical image to be processed into a medical image segmentation model trained in advance, and obtaining a segmentation result output by the medical image segmentation model.

Specifically, in the above step S101, the medical image to be processed refers to the medical image on which image segmentation currently needs to be performed. A medical image is an image used in the medical industry. It should be noted that the specific form of the medical image is not specifically limited in the embodiment of the present application, and those skilled in the art may make appropriate adjustments according to actual situations. For example, the medical image may include a thyroid ultrasound image, a mammography image, a carotid angiography image, and the like.

In addition, the embodiment of the present application does not specifically limit the manner of acquiring the medical image to be processed, and those skilled in the art may also make appropriate adjustments according to actual situations. For example, the medical image to be processed may be received from an external device; or the medical image to be processed may be read from local or cloud storage; or the medical image to be processed may be acquired in real time, and the like.

In the step S102, the medical image segmentation model is a neural network model for performing image segmentation on the medical image to be processed, where the medical image segmentation model in the step S102 is a medical image segmentation model trained in advance.

The following describes a structure of a medical image segmentation model provided in an embodiment of the present application. The medical image segmentation model may include a natural image segmentation module (Segment Anything Model, SAM); the natural image segmentation module may include an image encoder, an image decoder, and a segmentation adapter, wherein the segmentation adapter may be embedded in the image encoder.

In the natural image segmentation module described above, the image encoder may be based on a standard Vision Transformer (ViT), and the attention mechanism may consist of window attention with a 14 x 14 window size and four equally spaced global attention blocks; the image decoder is a Transformer-based decoder that includes dynamic prediction heads, which can process mask information for different prompts (text, points, boxes, etc.).
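As a minimal sketch of what "four equally spaced global attention" blocks could mean, the helper below picks which encoder blocks use global attention while the rest use 14 x 14 window attention. The function name `global_attention_indexes`, the placement rule and the example depth of 12 blocks are illustrative assumptions and are not stated in the embodiment.

```python
from typing import List


def global_attention_indexes(depth: int, num_global: int = 4) -> List[int]:
    """Return 0-based indexes of the blocks that use global attention.

    Assumption: the global blocks close each of `num_global` equal-sized
    groups of blocks; every other block uses window attention.
    """
    if depth % num_global != 0:
        raise ValueError("depth must be divisible by num_global")
    step = depth // num_global
    return [step * (i + 1) - 1 for i in range(num_global)]


def attention_kinds(depth: int, window_size: int = 14) -> List[str]:
    """Label every block as global attention or window attention."""
    global_blocks = set(global_attention_indexes(depth))
    return [
        "global" if i in global_blocks else f"window {window_size}x{window_size}"
        for i in range(depth)
    ]


if __name__ == "__main__":
    # Hypothetical encoder depth of 12 blocks.
    for index, kind in enumerate(attention_kinds(12)):
        print(index, kind)
```

Under these assumptions, an encoder with 12 blocks would use global attention in blocks 2, 5, 8 and 11 and window attention elsewhere.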

Referring to fig. 2, fig. 2 is a schematic diagram of a ViT structure according to an embodiment of the application. The ViT structure includes layer normalization (Layer Norm), multi-head attention (Multi-Head Attention), and a multi-layer perceptron (Multilayer Perceptron, MLP). Layer Norm is used to normalize each token; the MLP may include a fully connected layer, a GELU activation function, and Dropout.
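The components named above can be sketched in plain Python as follows. This is an illustrative sketch only: the pre-norm residual ordering in `vit_block`, the omission of the learned scale and shift in `layer_norm`, and the use of caller-supplied `attention` and `mlp` functions are assumptions, since the embodiment only names the components.

```python
import math
from typing import Callable, List

Token = List[float]


def layer_norm(token: Token, eps: float = 1e-5) -> Token:
    """Normalize one token to zero mean and unit variance (no learned gain/bias)."""
    mean = sum(token) / len(token)
    var = sum((v - mean) ** 2 for v in token) / len(token)
    return [(v - mean) / math.sqrt(var + eps) for v in token]


def gelu(x: float) -> float:
    """Exact GELU activation: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def vit_block(
    tokens: List[Token],
    attention: Callable[[List[Token]], List[Token]],
    mlp: Callable[[Token], Token],
) -> List[Token]:
    """One encoder block: Layer Norm -> attention -> residual, then
    Layer Norm -> MLP -> residual (assumed pre-norm ordering)."""
    attended = attention([layer_norm(t) for t in tokens])
    tokens = [[a + b for a, b in zip(t, d)] for t, d in zip(tokens, attended)]
    return [[a + b for a, b in zip(t, mlp(layer_norm(t)))] for t in tokens]
```

Here `attention` mixes information across all tokens, while `mlp` is applied to each token independently, which is why the two callables take different argument types.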

On the ViT structure shown in fig. 2, a segmentation adapter (Adapter) may be embedded. As an implementation manner, please refer to fig. 3, which is a schematic diagram of a ViT+Adapter structure according to an embodiment of the present application. On the basis of fig. 2, the ViT+Adapter structure introduces an Adapter in the Multi-Head Attention layer and in the MLP layer respectively.

Existing studies have shown that partial parameter tuning methods are more efficient than full fine-tuning, because they can avoid catastrophic forgetting and generalize better to out-of-domain scenarios, especially in low-data cases. The Adapter is an effective tool for fine-tuning large vision foundation models for downstream tasks, not only in natural language processing but also in computer vision. Since high-quality medical data is difficult to acquire and difficult to annotate, an Adapter is introduced in the embodiment of the application.

As another implementation manner, a scaling hyperparameter can be introduced into the MLP layer so as to better learn channel weights.
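One possible reading of the scaling hyperparameter is sketched below: the Adapter output is multiplied by a scalar before it is added to the MLP branch. The parallel placement of the Adapter, the residual sum and the names `mlp_with_scaled_adapter` and `scale` are assumptions made for illustration; the embodiment does not specify where the scaling is applied.

```python
from typing import Callable, List

Token = List[float]


def mlp_with_scaled_adapter(
    x: Token,
    mlp: Callable[[Token], Token],
    adapter: Callable[[Token], Token],
    scale: float,
) -> Token:
    """Combine the MLP branch and a parallel Adapter branch.

    Assumed form: out = x + mlp(x) + scale * adapter(x), where `scale`
    is the scaling hyperparameter weighting the Adapter contribution.
    """
    mlp_out = mlp(x)
    adapter_out = adapter(x)
    return [xi + mi + scale * ai for xi, mi, ai in zip(x, mlp_out, adapter_out)]


if __name__ == "__main__":
    token = [1.0, 2.0]
    doubled = lambda t: [2.0 * v for v in t]   # stand-in for the MLP
    ones = lambda t: [1.0 for _ in t]          # stand-in for the Adapter
    print(mlp_with_scaled_adapter(token, doubled, ones, scale=0.5))
```

With `scale` set to 0 the block reduces to the original MLP branch, so the hyperparameter controls how strongly the Adapter modifies the pre-trained behavior.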

It should be noted that, the specific structure of the Adapter is not limited in particular in the embodiment of the present application, and those skilled in the art may make appropriate adjustments according to practical situations. For example, as shown in fig. 3, the Adapter may include a Down layer, a ReLU layer, and an Up layer.
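The Down, ReLU and Up layers mentioned above can be sketched as a small bottleneck, written here in plain Python. The weight shapes, the optional skip connection and all names (`linear`, `adapter`, `w_down`, `w_up`) are illustrative assumptions; fig. 3 may arrange the layers differently.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per output unit


def linear(x: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def adapter(
    x: Vector,
    w_down: Matrix,
    b_down: Vector,
    w_up: Matrix,
    b_up: Vector,
    skip: bool = True,
) -> Vector:
    """Down-project, apply ReLU, up-project back to the input width.

    The skip connection is an assumption; with it, an Adapter whose Up
    layer is all zeros leaves the token unchanged.
    """
    hidden = relu(linear(x, w_down, b_down))
    out = linear(hidden, w_up, b_up)
    return [o + v for o, v in zip(out, x)] if skip else out


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0, 0.5]                        # token of width 4
    w_down = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # 4 -> 2
    w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # 2 -> 4
    print(adapter(x, w_down, [0.0, 0.0], w_up, [0.0] * 4))
```

Because the hidden width is smaller than the token width, the Adapter adds only a small number of parameters compared with the frozen encoder layers it sits beside.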

Based on the ViT+Adapter structure, in the process of training the medical image segmentation model, the image encoder can be trained by updating parameters of the segmentation adapter.

As an implementation manner, since the natural image segmentation module is a model trained on tens of millions of natural images and achieves very high accuracy on natural image segmentation, in the embodiment of the application the natural image segmentation module can be preloaded with pre-training parameters.

In the process of training the medical image segmentation model, a backbone network (namely an image encoder in a natural image segmentation module) can be frozen, and only the Adapter part is trained. That is, during the process of training the medical image segmentation model, the parameters of the image encoder are not changed, and the parameters of the Adapter are iteratively updated until a trained medical image segmentation model is obtained.
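The freezing described above can be illustrated with a toy parameter registry: encoder parameters are marked as not trainable, and an update step skips them. The parameter names (`image_encoder.`, `.adapter.`), the `Param` class and the plain gradient-descent update are hypothetical and serve only to show the idea.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Param:
    value: float
    trainable: bool = True


def freeze_encoder_except_adapter(params: Dict[str, Param]) -> None:
    """Freeze every image-encoder parameter that does not belong to an Adapter."""
    for name, param in params.items():
        if name.startswith("image_encoder.") and ".adapter." not in name:
            param.trainable = False


def sgd_step(params: Dict[str, Param], grads: Dict[str, float], lr: float) -> None:
    """Apply one gradient-descent update to the trainable parameters only."""
    for name, param in params.items():
        if param.trainable:
            param.value -= lr * grads.get(name, 0.0)


if __name__ == "__main__":
    params = {
        "image_encoder.block0.attn.weight": Param(1.0),
        "image_encoder.block0.adapter.down.weight": Param(0.0),
    }
    freeze_encoder_except_adapter(params)
    sgd_step(params, {name: 1.0 for name in params}, lr=0.1)
    for name, param in params.items():
        print(name, param.value, param.trainable)
```

In this sketch the frozen attention weight keeps its pre-trained value after the update, while the Adapter weight moves, which mirrors the statement that the encoder parameters stay fixed and the Adapter parameters are updated iteratively.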

It can be understood that, because the natural image segmentation module is preloaded with pre-training parameters that give high accuracy on natural image segmentation, the segmentation capability of a medical image segmentation model trained on this basis is improved. In addition, because only the Adapter part needs to be trained, a medical image segmentation model with high accuracy can be obtained by training with only a small amount of medical segmentation image data. This solves the technical problem in the prior art that the accuracy of the segmentation model is low because the privacy of medical data and the expertise required for data labeling lead to a lack of large-scale, high-quality medical segmentation image data.

It should be noted that, in the prior art, the natural image segmentation module generally includes an image encoder, an image decoder, and a mask decoder; in the embodiment of the present application, however, only the image encoding and image decoding parts are used and no dynamic mask input is required, so the mask decoder may be omitted.

In addition, the embodiment of the present application does not specifically limit the specific implementation of the segmentation result in step S102, and those skilled in the art may make appropriate adjustments according to the actual situation. For example, when the medical image comprises a thyroid ultrasound image, the segmentation result may comprise the location of a thyroid nodule; alternatively, when the medical image includes a carotid angiography image, the segmentation result may include the location of carotid plaque, or the like.

Further, based on the above embodiment, the step S102 may specifically include the following steps:

Step 1) inputting the medical image to be processed into an image encoder to obtain an encoding result output by the image encoder.

Step 2) inputting the encoding result into an image decoder to obtain a segmentation result output by the image decoder.

Specifically, in step 1) above, the image encoder may be based on a standard Vision Transformer (ViT), and the attention mechanism may consist of window attention with a 14 x 14 window size and four equally spaced global attention blocks. As shown in fig. 3, the medical image to be processed may be input into the image encoder, where it passes through Layer Norm, Multi-Head Attention, the Adapter and the MLP to finally yield the encoding result output by the image encoder.

As an embodiment, the medical image to be processed may be downsampled 16 times and then used as an input to an image encoder.

In step 2) above, the image decoder is a Transformer-based decoder that includes dynamic prediction heads, which can process mask information for different prompts (text, points, boxes, etc.). The encoding result can be input into the image decoder, and finally the segmentation result output by the image decoder can be obtained.

In the above scheme, the natural image segmentation module may be used to segment the medical image to be processed, so as to obtain a corresponding segmentation result. The natural image segmentation module has strong segmentation capability, so that the whole medical image segmentation model has a high-precision segmentation effect in the medical image.
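The two-step flow of step 1) and step 2) can be sketched as a simple composition of an encoder and a decoder. The function `segment` and the toy stand-ins `toy_encoder` and `toy_decoder` are hypothetical placeholders; a real image encoder and image decoder would be the trained networks described in this embodiment.

```python
from typing import Callable, List

Image = List[List[float]]   # 2-D grayscale image with values in [0, 1]
Mask = List[List[int]]      # 2-D binary segmentation mask


def segment(
    image: Image,
    encoder: Callable[[Image], List[List[float]]],
    decoder: Callable[[List[List[float]]], Mask],
) -> Mask:
    """Step 1): encode the image; step 2): decode the encoding into a mask."""
    encoding = encoder(image)
    return decoder(encoding)


def toy_encoder(image: Image) -> List[List[float]]:
    """Stand-in encoder: shifts intensities so that 0.5 maps to zero."""
    return [[pixel - 0.5 for pixel in row] for row in image]


def toy_decoder(encoding: List[List[float]]) -> Mask:
    """Stand-in decoder: marks positive responses as foreground."""
    return [[1 if value > 0 else 0 for value in row] for row in encoding]


if __name__ == "__main__":
    print(segment([[0.9, 0.1], [0.2, 0.8]], toy_encoder, toy_decoder))
```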

Further, on the basis of the above embodiment, the medical image segmentation model in the above embodiment may be trained by using the following steps:

Step 1) acquiring medical segmentation image data and a medical image segmentation model to be trained.

Step 2) loading corresponding pre-training parameters for the natural image segmentation module.

Step 3) updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

Specifically, in the above step 1), the medical segmentation image data refers to medical images that have been segmented and annotated. It can be appreciated that, because the Adapter is introduced in the embodiment of the present application, a large amount of medical segmentation image data is not required; that is, only a small amount of medical segmentation image data is needed to train the medical image segmentation model.

It should be noted that the embodiment of the present application does not specifically limit the manner of acquiring the medical segmentation image data, and those skilled in the art may also make appropriate adjustments according to actual situations. For example, the medical segmentation image data may be received from an external device; alternatively, the medical segmentation image data may be read from local or cloud storage, and the like.

The medical image segmentation model refers to a neural network model for performing image segmentation on the medical image to be processed, wherein the medical image segmentation model in the step 1) refers to a medical image segmentation model that is untrained or not yet fully trained. It will be appreciated that the medical image segmentation model in step 1) above is identical in structure to the medical image segmentation model in step S102 above, and differs only in its model parameters.

Similar to the acquisition of the above-mentioned medical segmentation image data, the embodiment of the present application does not limit the specific implementation of acquiring the above-mentioned medical image segmentation model to be trained, and those skilled in the art may make appropriate adjustments according to the actual situation. For example, a medical image segmentation model to be trained may be received from an external device; alternatively, a medical image segmentation model to be trained that is stored locally or in the cloud may be read.

In the step 2), considering that the natural image segmentation module is a model trained on tens of millions of natural images and can achieve very high segmentation accuracy on natural images, the corresponding pre-training parameters can be loaded for the natural image segmentation module.

In the step 3), in the training process for the medical image segmentation model, the backbone network (i.e., the image encoder in the natural image segmentation module) may be frozen, and parameters of the segmentation adapter and the image decoder in the medical image segmentation model are updated by using the medical segmentation image data, so as to obtain the trained medical image segmentation model.

In the above scheme, the image encoder may be frozen during the training of the medical image segmentation model, and only a small amount of medical segmentation image data is used to train the segmentation adapter and simultaneously train the image decoder, thereby obtaining a trained medical image segmentation model. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.
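As a non-limiting illustration, the choice of which parameters are updated can be sketched as a filter over parameter names. The sketch below is framework-independent and assumes a hypothetical naming scheme (`image_encoder.`, `.adapter.`, `image_decoder.`, `sms.`); these names are not specified in the application and are used only for illustration.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical parameter names; the application does not define identifiers.
EXAMPLE_NAMES = [
    "image_encoder.blocks.0.attn.qkv.weight",      # backbone weight
    "image_encoder.blocks.0.adapter.down.weight",  # adapter embedded in the encoder
    "image_decoder.mask_head.weight",              # image decoder weight
    "sms.conv1.weight",                            # spatial multi-scale module weight
]


def split_trainable(
    names: Iterable[str], extra_prefixes: Sequence[str] = ()
) -> Tuple[List[str], List[str]]:
    """Split parameter names into (trainable, frozen).

    Adapter and decoder parameters are trainable; every other parameter,
    including the encoder backbone, is frozen unless its name starts with
    one of `extra_prefixes`.
    """
    trainable, frozen = [], []
    for name in names:
        in_adapter = ".adapter." in name
        in_decoder = name.startswith("image_decoder.")
        in_extra = name.startswith(tuple(extra_prefixes))
        if in_adapter or in_decoder or in_extra:
            trainable.append(name)
        else:
            frozen.append(name)
    return trainable, frozen


if __name__ == "__main__":
    trainable, frozen = split_trainable(EXAMPLE_NAMES)
    print("trainable:", trainable)
    print("frozen:", frozen)
```

Passing `extra_prefixes=("sms.",)` would additionally mark a spatial multi-scale module as trainable, which corresponds to the variant in which that module is trained together with the adapter and the decoder. In a deep learning framework, the frozen list would typically be excluded from gradient computation and from the optimizer.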

Further, on the basis of the above embodiment, the medical image segmentation model may further include: a spatial multi-scale information feature extraction module (Spatial Multi-Scale, SMS). Referring to fig. 4, fig. 4 is a schematic diagram of an SMS structure according to an embodiment of the present application, and the spatial multi-scale information feature extraction module may use a standard convolutional stem borrowed from ResNet.

As one embodiment, the spatial multi-scale information feature extraction module may include three convolution layers and one max pooling layer; a 3 × 3 convolution layer is then used to double the number of channels and reduce the size of the feature map; a plurality of 1 × 1 convolution layers are then applied to project the feature maps to D dimensions. A feature pyramid {F1, F2, F3, F4} of D-dimensional feature maps is thus obtained, with resolutions of 1/8, 1/16, 1/32, and 1/64 of the original image, respectively. The feature maps of the pyramid are then flattened and concatenated into new features used for feature interaction, which form the output of the spatial multi-scale information feature extraction module.
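To make the pyramid sizes concrete, the following sketch computes the shape of each level and the number of tokens obtained after flattening and concatenation. The input size of 1024 × 1024 and D = 256 are assumptions chosen for illustration; the application does not fix these values.

```python
from typing import List, Sequence, Tuple

# Downsampling factors of F1..F4 relative to the original image, as stated above.
STRIDES = (8, 16, 32, 64)


def pyramid_shapes(
    height: int, width: int, dim: int, strides: Sequence[int] = STRIDES
) -> List[Tuple[int, int, int]]:
    """Return the (channels, height, width) shape of each pyramid level."""
    return [(dim, height // s, width // s) for s in strides]


def flattened_token_count(
    height: int, width: int, strides: Sequence[int] = STRIDES
) -> int:
    """Number of D-dimensional tokens after flattening and concatenating all levels."""
    return sum((height // s) * (width // s) for s in strides)


if __name__ == "__main__":
    # Assumed example values: 1024 x 1024 input, D = 256.
    for level, shape in enumerate(pyramid_shapes(1024, 1024, 256), start=1):
        print(f"F{level}: {shape}")
    print("tokens:", flattened_token_count(1024, 1024))  # 21760
```

Under these assumed values the concatenated feature is a sequence of 128² + 64² + 32² + 16² = 21760 tokens, each of dimension D.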

The spatial multi-scale information feature extraction module is used for transmitting multi-scale spatial information to the natural image segmentation module. The spatial multi-scale information feature extraction module can be arranged in parallel with the patch embedding layer to model the local spatial context of the image, so that the original structure of the natural image segmentation module is not changed. Thus, the output of the spatial multi-scale information feature extraction module may be input to a subsequent segmentation adapter, which then operates together with the i-th block of the image encoder.

In this case, inputting the medical image to be processed into the medical image segmentation model trained in advance to obtain the segmentation result output by the medical image segmentation model may include the following steps:

Step 1) inputting the medical image to be processed into the spatial multi-scale information feature extraction module to obtain feature data output by the spatial multi-scale information feature extraction module.

Step 2) inputting the medical image to be processed and the feature data into the image encoder to obtain an encoding result output by the image encoder.

Step 3) inputting the encoding result into the image decoder to obtain a segmentation result output by the image decoder.

Specifically, in the step 1), the spatial multi-scale information feature extraction module may sequentially include three convolution layers, a max pooling layer, a 3 × 3 convolution layer, and a plurality of 1 × 1 convolution layers, so as to obtain a feature pyramid {F1, F2, F3, F4} of D-dimensional feature maps, with resolutions of 1/8, 1/16, 1/32, and 1/64 of the original image, respectively. The feature maps of the pyramid are then flattened and concatenated into new features used for feature interaction, so as to obtain the feature data output by the spatial multi-scale information feature extraction module.

In step 2) above, the image encoder may be based on a standard Vision Transformer (ViT), and the attention mechanism may consist of windowed attention with a window size of 14 × 14 and four equally spaced global attention blocks. As shown in fig. 3, the medical image to be processed and the feature data may be input into the image encoder and passed through Layer Norm, Multi-Head Attention, the adapter, and the MLP, to finally obtain the encoding result output by the image encoder.

The medical image to be processed may be input to a Layer Norm in the image encoder, and the feature data may be input to an adapter in the image encoder.
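One possible reading of "four equally spaced global attention blocks" is sketched below: the encoder blocks are divided into four equal groups, and the last block of each group uses global attention while the rest use windowed attention. The encoder depth and the exact placement within each group are assumptions for illustration; the application states only the window size and the number of global attention blocks.

```python
from typing import List


def global_attention_indices(depth: int, num_global: int = 4) -> List[int]:
    """Indices (0-based) of the blocks that use global attention.

    Assumes the blocks are split into `num_global` equal groups and that the
    last block of each group is the global one.
    """
    if depth <= 0 or depth % num_global != 0:
        raise ValueError("depth must be a positive multiple of num_global")
    step = depth // num_global
    return [step * (i + 1) - 1 for i in range(num_global)]


def attention_layout(depth: int, num_global: int = 4) -> List[str]:
    """Per-block attention type: 'window' (14 x 14 windows) or 'global'."""
    global_blocks = set(global_attention_indices(depth, num_global))
    return ["global" if i in global_blocks else "window" for i in range(depth)]


if __name__ == "__main__":
    # Assumed depth of 12 blocks, chosen only for illustration.
    print(global_attention_indices(12))  # [2, 5, 8, 11]
    print(attention_layout(12))
```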

In step 3) above, the image decoder is a Transformer-based decoder that includes a dynamic prediction head capable of processing mask information for different prompts (text, points, boxes, etc.). The encoding result can be input into the image decoder, and the segmentation result output by the image decoder is finally obtained.

In the scheme, a spatial multi-scale information feature extraction module can be introduced, and before the medical image to be processed is subjected to image segmentation, multi-scale spatial information is transmitted to the backbone network based on the medical image to be processed, so that the segmentation accuracy of the medical image segmentation model obtained through training is improved.
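The data flow of the three inference steps can be sketched as a composition of three callables. The sketch below uses toy stand-ins for the modules (`toy_sms`, `toy_encoder`, `toy_decoder` are hypothetical and perform no real feature extraction); it only illustrates the order in which the image, the feature data, and the encoding result are passed between modules.

```python
from typing import Any, Callable, List, Optional

Image = List[List[float]]


def segment(
    image: Image,
    encoder: Callable[[Image, Any], Any],
    decoder: Callable[[Any], Any],
    sms: Optional[Callable[[Image], Any]] = None,
) -> Any:
    """Run step 1 (optional SMS), step 2 (encoder), and step 3 (decoder) in order."""
    features = sms(image) if sms is not None else None  # step 1
    encoding = encoder(image, features)                  # step 2
    return decoder(encoding)                             # step 3


# Toy stand-ins, for illustration only.
def toy_sms(image: Image) -> dict:
    return {"num_levels": 4}


def toy_encoder(image: Image, features: Any) -> dict:
    return {"image": image, "features": features}


def toy_decoder(encoding: dict) -> List[List[int]]:
    # Threshold each pixel to produce a binary mask.
    return [[1 if px > 0.5 else 0 for px in row] for row in encoding["image"]]


if __name__ == "__main__":
    mask = segment([[0.2, 0.9], [0.7, 0.1]], toy_encoder, toy_decoder, sms=toy_sms)
    print(mask)  # [[0, 1], [1, 0]]
```

Leaving `sms` as `None` corresponds to the embodiment without the spatial multi-scale information feature extraction module, in which only the image is passed to the encoder.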

Further, on the basis of the above embodiment, the medical image segmentation model may be trained by the following steps:

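Step 1) acquiring medical segmentation image data and a medical image segmentation model to be trained.

Step 2) loading corresponding pre-training parameters for the natural image segmentation module.

Step 3) updating parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.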

Specifically, in the step 1), similar to the above embodiment, the embodiment of the present application does not limit the specific implementation of acquiring the above medical segmentation image data and the medical image segmentation model to be trained, and those skilled in the art may make suitable adjustments according to the actual situation. For example, the medical segmentation image data and the medical image segmentation model to be trained may be received from an external device; alternatively, the medical segmentation image data and the medical image segmentation model to be trained that are stored locally or in the cloud may be read.

In the step 3), in the training process for the medical image segmentation model, the backbone network (i.e., the image encoder in the natural image segmentation module) may be frozen, and parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model are updated by using the medical segmentation image data, so as to obtain the trained medical image segmentation model.

In the above scheme, in the process of training the medical image segmentation model, the image encoder can be frozen, only a small amount of medical segmentation image data is used for training the segmentation adapter, and the image decoder and the spatial multi-scale information feature extraction module are trained, so that the trained medical image segmentation model is obtained. Therefore, even if large-scale high-quality medical segmentation image data is lacking, a medical image segmentation model with high segmentation accuracy can be obtained through training in the above manner.

Referring to fig. 5, fig. 5 is a flowchart of a training method of a medical image segmentation model according to an embodiment of the present application, where the training method of the medical image segmentation model may include the following steps:

Step S501: acquiring medical segmentation image data and a medical image segmentation model to be trained.

Step S502: loading corresponding pre-training parameters for the natural image segmentation module.

Step S503: updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

Specifically, in the above step S501, the medical segmentation image data refers to medical images that have been segmented and labeled. It can be appreciated that, because the adapter is introduced in the embodiment of the present application, a large amount of medical segmentation image data is not required; that is, only a small amount of medical segmentation image data is needed to train the medical image segmentation model.

The medical image segmentation model refers to a neural network model for performing image segmentation on the medical image to be processed, and the medical image segmentation model in the step S501 refers to a medical image segmentation model that is untrained or not yet fully trained. As one embodiment, the medical image segmentation model may include a natural image segmentation module including an image encoder, an image decoder, and a segmentation adapter, and the segmentation adapter may be embedded in the image encoder.

In the step S502, considering that the natural image segmentation module is a model trained on tens of millions of natural images and can achieve very high segmentation accuracy on natural images, the corresponding pre-training parameters can be loaded for the natural image segmentation module.
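A minimal sketch of step S502 follows, under the assumption that parameters are held in name-to-value mappings. Pre-trained values overwrite the parameters that exist in both mappings, while parameters absent from the pre-trained mapping (for example, those of a newly embedded segmentation adapter) keep their initial values. All parameter names and values here are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def load_pretrained(
    model_params: Dict[str, Any], pretrained: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str], List[str]]:
    """Merge pre-trained values into a copy of the model parameters.

    Returns (merged, loaded, kept_initial), where `loaded` lists the names
    taken from `pretrained` and `kept_initial` lists the names left unchanged.
    """
    merged = dict(model_params)
    loaded, kept_initial = [], []
    for name in model_params:
        if name in pretrained:
            merged[name] = pretrained[name]
            loaded.append(name)
        else:
            kept_initial.append(name)
    return merged, loaded, kept_initial


if __name__ == "__main__":
    # Hypothetical names and placeholder values, for illustration only.
    model = {
        "image_encoder.blocks.0.attn.weight": 0.0,
        "image_encoder.blocks.0.adapter.down.weight": 0.01,
        "image_decoder.mask_head.weight": 0.0,
    }
    checkpoint = {
        "image_encoder.blocks.0.attn.weight": 0.42,
        "image_decoder.mask_head.weight": -0.17,
    }
    merged, loaded, kept_initial = load_pretrained(model, checkpoint)
    print("loaded:", loaded)
    print("kept initial values:", kept_initial)
```

Reporting the names that kept their initial values gives a simple check that only the newly introduced components are missing from the pre-trained parameters.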

In the above step S503, in the process of training the medical image segmentation model, the backbone network (i.e., the image encoder in the natural image segmentation module) may be frozen, and parameters of the segmentation adapter and the image decoder in the medical image segmentation model are updated by using the medical segmentation image data, so as to obtain a trained medical image segmentation model.

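Further, on the basis of the above embodiment, the medical image segmentation model may further include: a spatial multi-scale information feature extraction module. For this case, the embodiment of the present application introduces another training method of a medical image segmentation model, in which parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model are updated by using the medical segmentation image data to obtain a trained medical image segmentation model.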

Referring to fig. 6, fig. 6 is a block diagram illustrating a medical image segmentation apparatus 600 according to an embodiment of the present application, where the medical image segmentation apparatus 600 includes: a first acquiring module 601, configured to acquire a medical image to be processed; the input module 602 is configured to input the medical image to be processed into a medical image segmentation model trained in advance, so as to obtain a segmentation result output by the medical image segmentation model; the medical image segmentation model comprises a natural image segmentation module, wherein the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, the segmentation adapter is embedded into the image encoder, and training of the image encoder is achieved by updating parameters of the segmentation adapter.

Further, based on the above embodiment, the input module 602 is specifically configured to: inputting the medical image to be processed into the image encoder to obtain an encoding result output by the image encoder; and inputting the coding result into the image decoder to obtain the segmentation result output by the image decoder.

Further, on the basis of the above embodiment, the medical image segmentation apparatus 600 further includes: the first training module is used for training the medical image segmentation model by the following steps: acquiring medical segmentation image data and a medical image segmentation model to be trained; loading corresponding pre-training parameters for the natural image segmentation module; and updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

Further, on the basis of the above embodiment, the medical image segmentation model further includes: a spatial multi-scale information feature extraction module; the input module 602 is specifically configured to: inputting the medical image to be processed into the spatial multi-scale information feature extraction module to obtain feature data output by the spatial multi-scale information feature extraction module; inputting the medical image to be processed and the characteristic data into the image encoder to obtain an encoding result output by the image encoder; and inputting the coding result into the image decoder to obtain the segmentation result output by the image decoder.

Further, on the basis of the above embodiment, the medical image segmentation apparatus 600 further includes: the second training module is used for training the medical image segmentation model by the following steps: acquiring medical segmentation image data and a medical image segmentation model to be trained; loading corresponding pre-training parameters for the natural image segmentation module; and updating parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

Referring to fig. 7, fig. 7 is a block diagram of a training apparatus for a medical image segmentation model according to an embodiment of the present application, where the training apparatus 700 for a medical image segmentation model includes: a second obtaining module 701, configured to obtain medical segmentation image data and a medical image segmentation model to be trained, where the medical image segmentation model includes a natural image segmentation module, the natural image segmentation module includes an image encoder, an image decoder, and a segmentation adapter, and the segmentation adapter is embedded in the image encoder; the loading module 702 is configured to load corresponding pre-training parameters for the natural image segmentation module; and an updating module 703, configured to update parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data, so as to obtain a trained medical image segmentation model.

Further, on the basis of the above embodiment, the medical image segmentation model further includes: a spatial multi-scale information feature extraction module; the updating module 703 is specifically configured to: update parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

Referring to fig. 8, fig. 8 is a block diagram of an electronic device according to an embodiment of the present application, where the electronic device 800 includes: at least one processor 801, at least one communication interface 802, at least one memory 803, and at least one communication bus 804. Where communication bus 804 is used to enable direct connection communication of these components, communication interface 802 is used for signaling or data communication with other node devices, and memory 803 stores machine readable instructions executable by processor 801. When the electronic device 800 is in operation, the processor 801 communicates with the memory 803 via the communication bus 804, and machine readable instructions when invoked by the processor 801 perform the medical image segmentation method or training method of a medical image segmentation model described above.

For example, the processor 801 of the embodiment of the present application reads a computer program from the memory 803 through the communication bus 804 and executes the computer program. As one implementation, the following method may be implemented: Step S101: acquiring a medical image to be processed. Step S102: inputting the medical image to be processed into a medical image segmentation model trained in advance, and obtaining a segmentation result output by the medical image segmentation model. As another implementation, the following method may be implemented: Step S501: acquiring medical segmentation image data and a medical image segmentation model to be trained. Step S502: loading corresponding pre-training parameters for the natural image segmentation module. Step S503: updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

There may be one or more processors 801, each of which may be an integrated circuit chip having signal processing capabilities. The processor 801 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a micro control unit (Micro Controller Unit, MCU), a network processor (Network Processor, NP), or another conventional processor; it may also be a special-purpose processor, including a neural network processor (Neural-network Processing Unit, NPU), a graphics processor (Graphics Processing Unit, GPU), a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Also, when there are a plurality of processors 801, some of them may be general-purpose processors and the others may be special-purpose processors.

There may be one or more memories 803, each of which may be, but is not limited to, a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), and the like.

It is to be understood that the configuration shown in fig. 8 is merely illustrative, and that electronic device 800 may also include more or fewer components than those shown in fig. 8, or have a different configuration than that shown in fig. 8. The components shown in fig. 8 may be implemented in hardware, software, or a combination thereof. In the embodiment of the present application, the electronic device 800 may be, but is not limited to, a physical device such as a desktop, a notebook, a smart phone, an intelligent wearable device, a vehicle-mounted device, or a virtual device such as a virtual machine. In addition, the electronic device 800 is not necessarily a single device, and may be a combination of a plurality of devices, for example, a server cluster, or the like.

The embodiment of the application also provides a computer readable storage medium, which stores computer program instructions, and when the computer program instructions are executed by a computer, the computer is caused to execute the medical image segmentation method or the training method of the medical image segmentation model according to the embodiment of the method.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is merely a division by logical function, and there may be other manners of division in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, devices, or units, and may be in electrical, mechanical, or other form.

Further, the units described as separate units may or may not be physically separate, and units displayed as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Furthermore, functional modules in various embodiments of the present application may be integrated together to form a single portion, or each module may exist alone, or two or more modules may be integrated to form a single portion.

It should be noted that the functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.

The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A medical image segmentation method, comprising:

acquiring a medical image to be processed;

inputting the medical image to be processed into a medical image segmentation model trained in advance to obtain a segmentation result output by the medical image segmentation model;

the medical image segmentation model comprises a natural image segmentation module, wherein the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, the segmentation adapter is embedded into the image encoder, and training of the image encoder is achieved by updating parameters of the segmentation adapter.

2. The medical image segmentation method according to claim 1, wherein the inputting the medical image to be processed into a pre-trained medical image segmentation model to obtain the segmentation result output by the medical image segmentation model comprises:

inputting the medical image to be processed into the image encoder to obtain an encoding result output by the image encoder;

and inputting the coding result into the image decoder to obtain the segmentation result output by the image decoder.

3. The medical image segmentation method as set forth in claim 1, further comprising:

training the medical image segmentation model by using the following steps:

acquiring medical segmentation image data and a medical image segmentation model to be trained;

loading corresponding pre-training parameters for the natural image segmentation module;

and updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

4. The medical image segmentation method as set forth in claim 1, wherein the medical image segmentation model further comprises: a spatial multi-scale information feature extraction module;

inputting the medical image to be processed into a medical image segmentation model trained in advance to obtain a segmentation result output by the medical image segmentation model, wherein the method comprises the following steps:

inputting the medical image to be processed into the spatial multi-scale information feature extraction module to obtain feature data output by the spatial multi-scale information feature extraction module;

inputting the medical image to be processed and the characteristic data into the image encoder to obtain an encoding result output by the image encoder;

5. The medical image segmentation method as set forth in claim 4, further comprising:

training the medical image segmentation model by using the following steps:

and updating parameters of the segmentation adapter, the image decoder and the spatial multi-scale information feature extraction module in the medical image segmentation model by using the medical segmentation image data to obtain a trained medical image segmentation model.

6. A method of training a medical image segmentation model, comprising:

acquiring medical segmentation image data and a medical image segmentation model to be trained, wherein the medical image segmentation model comprises a natural image segmentation module, the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, and the segmentation adapter is embedded in the image encoder;

7. A medical image segmentation apparatus, comprising:

the first acquisition module is used for acquiring a medical image to be processed;

the input module is used for inputting the medical image to be processed into a medical image segmentation model trained in advance to obtain a segmentation result output by the medical image segmentation model;

8. A training device for a medical image segmentation model, comprising:

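the second acquisition module is used for acquiring medical segmentation image data and a medical image segmentation model to be trained, wherein the medical image segmentation model comprises a natural image segmentation module, the natural image segmentation module comprises an image encoder, an image decoder and a segmentation adapter, and the segmentation adapter is embedded in the image encoder;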

the loading module is used for loading corresponding pre-training parameters aiming at the natural image segmentation module;

and the updating module is used for updating parameters of the segmentation adapter and the image decoder in the medical image segmentation model by utilizing the medical segmentation image data to obtain a trained medical image segmentation model.

9. An electronic device, comprising: a processor, a memory, and a bus;

the processor and the memory complete communication with each other through the bus;

the memory stores computer program instructions executable by the processor, the processor invoking the computer program instructions to be able to perform the medical image segmentation method according to any of claims 1-5 or the training method of the medical image segmentation model according to claim 6.

10. A computer readable storage medium storing computer program instructions which, when executed by a computer, cause the computer to perform the medical image segmentation method according to any one of claims 1-5 or the training method of the medical image segmentation model according to claim 6.