CN118196119A - Model training method, target segmentation method and device - Google Patents

Model training method, target segmentation method and device

Publication number
CN118196119A
CN118196119A (application number CN202410374049.7A)
Authority
CN
China
Prior art keywords
target
segmentation
network
model
regression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410374049.7A
Other languages
Chinese (zh)
Inventor
孙雪梅
李崇明
闵祥德
冯朝燕
张配配
Current Assignee
Tongji Hospital Affiliated To Tongji Medical College Of Huazhong University Of Science & Technology
Original Assignee
Tongji Hospital Affiliated To Tongji Medical College Of Huazhong University Of Science & Technology
Priority date
Filing date
Publication date
Application filed by Tongji Hospital Affiliated To Tongji Medical College Of Huazhong University Of Science & Technology
Priority claimed from CN202410374049.7A
Publication of CN118196119A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a model training method, a target segmentation method and a device. The model training method comprises the following steps: acquiring an original model, wherein the original model comprises a first student network, a teacher network and a second student network; acquiring a first medical image acquired for a first sample object and a segmentation label obtained by marking a target in the first medical image, and obtaining a group of segmentation training samples and a group of regression training samples based on the first medical image and the segmentation label; training the first student network based on a plurality of groups of segmentation training samples to obtain a segmentation student network, and training the second student network based on a plurality of groups of regression training samples to obtain a regression student network; and adjusting network parameters in the teacher network by utilizing the segmentation student network and the regression student network to obtain a target model for segmenting the target. According to the technical scheme provided by the embodiment of the invention, a target model with stronger generalization capability can be trained even when segmentation labels are limited.

Description

Model training method, target segmentation method and device
Technical Field
The embodiment of the invention relates to the field of medical image processing, in particular to a model training method, a target segmentation method and a device.
Background
In the field of medical image processing, medical image segmentation has been an important means of assisting doctors in diagnosis. Illustratively, bladder segmentation is one application of medical image segmentation, whose purpose is to accurately segment bladder regions from complex medical images to assist doctors in diagnosing bladder disease.
With the development of deep learning technology, image segmentation based on convolutional neural network becomes a research hotspot, and particularly Unet and variants thereof are widely applied to medical image segmentation tasks due to excellent segmentation performance. It will be appreciated that the number of segmentation labels directly affects the segmentation performance of the model.
However, segmentation labels for medical images are difficult to obtain, and a model trained on sparse segmentation labels has limited generalization capability: overfitting is likely to occur, and the model's segmentation performance on new, unlabeled medical images is low, so the model needs to be improved.
Disclosure of Invention
The embodiment of the invention provides a model training method, a target segmentation method and a device, which are used for training a target model with stronger generalization capability under the condition of limited segmentation labels.
According to an aspect of the present invention, there is provided a model training method, which may include:
obtaining an original model, wherein the original model comprises a first student network and a teacher network for segmenting a target, and a second student network for regressing the target;
acquiring a first medical image acquired for a first sample object and a segmentation label obtained by marking a target in the first medical image, and obtaining a group of segmentation training samples and a group of regression training samples based on the first medical image and the segmentation label;
Training a first student network based on a plurality of groups of segmentation training samples to obtain a segmentation student network, and training a second student network based on a plurality of groups of regression training samples to obtain a regression student network;
and adjusting network parameters in the teacher network by utilizing the segmentation student network and the regression student network to obtain a target model for segmenting the target.
According to another aspect of the present invention, there is provided a target segmentation method, which may include:
acquiring a target medical image acquired for a target object, a target model trained by the model training method according to any embodiment of the invention and a target applied in the training process of the target model;
inputting the target medical image into a target model, and obtaining a target area where the target is located in the target medical image according to an output result of the target model.
According to another aspect of the present invention, there is provided a model training apparatus, which may include:
an original model acquisition module, used for acquiring an original model, wherein the original model comprises a first student network and a teacher network for segmenting a target and a second student network for regressing the target;
the training sample obtaining module is used for obtaining a first medical image acquired for a first sample object and a segmentation label obtained by marking a target in the first medical image, and obtaining a group of segmentation training samples and a group of regression training samples based on the first medical image and the segmentation label;
The regression student network obtaining module is used for training the first student network based on a plurality of groups of segmentation training samples to obtain a segmentation student network, and training the second student network based on a plurality of groups of regression training samples to obtain a regression student network;
The target model obtaining module is used for utilizing the segmentation student network and the regression student network to adjust network parameters in the teacher network so as to obtain a target model for segmenting the target.
According to another aspect of the present invention, there is provided a target segmentation apparatus, which may include:
the target acquisition module is used for acquiring a target medical image acquired for a target object, a target model obtained by training according to the model training method according to any embodiment of the invention and a target applied in the training process of the target model;
The target segmentation module is used for inputting the target medical image into the target model and obtaining a target area where the target is located in the target medical image according to an output result of the target model.
According to the technical scheme of the embodiment of the invention, an original model with a double-student network structure is obtained, wherein the original model may comprise a first student network and a teacher network for segmenting a target and a second student network for regressing the target; a first medical image acquired for a first sample object and a segmentation label obtained by marking the target in the first medical image are acquired, and a group of segmentation training samples and a group of regression training samples are obtained based on the first medical image and the segmentation label; the first student network is trained based on a plurality of groups of segmentation training samples to obtain a segmentation student network, and the second student network is trained based on a plurality of groups of regression training samples to obtain a regression student network; on this basis, network parameters in the teacher network may be adjusted by utilizing the segmentation student network and the regression student network to obtain a target model for segmenting the target. By introducing the double-student network structure, the two student networks concentrate on different tasks and therefore learn different characteristics during model training, so that even when the segmentation labels are limited, the generalization capability of the model can be ensured through multi-task learning.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention, nor is it intended to be used to limit the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a model training method provided in accordance with an embodiment of the present invention;
FIG. 2 is a flow chart of another model training method provided in accordance with an embodiment of the present invention;
FIG. 3 is a flow chart of yet another model training method provided in accordance with an embodiment of the present invention;
FIG. 4 is a schematic diagram of an alternative example of a model training method provided in accordance with an embodiment of the present invention;
FIG. 5 is a flow chart of a method of object segmentation provided in accordance with an embodiment of the present invention;
FIG. 6 is a block diagram of a model training apparatus provided in accordance with an embodiment of the present invention;
FIG. 7 is a block diagram of a target segmentation apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an electronic device implementing a model training method or a target segmentation method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. The cases of "target", "original", etc. are similar and will not be described in detail herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the technical scheme of the invention, the acquisition, collection, updating, analysis, processing, use, transmission and storage of the personal information of the user comply with relevant laws and regulations, are used for legal purposes, and do not violate public order and good custom. Necessary measures are taken for the personal information of the user to prevent illegal access to the personal information data of the user and to maintain the personal information security and network security of the user.
FIG. 1 is a flow chart of a model training method provided in an embodiment of the present invention. The embodiment is applicable to the situation of model training by using the segmentation labels, and is particularly applicable to the situation of model training by using limited segmentation labels based on a double student network focusing on different tasks. The method may be performed by a model training apparatus provided by an embodiment of the present invention, where the apparatus may be implemented by software and/or hardware, and the apparatus may be integrated on an electronic device, where the electronic device may be a variety of user terminals or servers.
Referring to fig. 1, the method of the embodiment of the present invention specifically includes the following steps:
S110, acquiring an original model, wherein the original model comprises a first student network and a teacher network for segmenting a target and a second student network for regressing the target.
The original model may be understood as a deep learning model requiring training, and in the embodiment of the present invention, the original model includes two student networks (i.e., a first student network and a second student network) focusing on different tasks and one teacher network focusing on dividing tasks.
Specifically, the first student network is dedicated to the segmentation task, that is, it is responsible for segmenting the target, specifically, determining whether each pixel point in the first medical image belongs to the target. The target may be understood as the content to be segmented from the medical image; in connection with application scenarios possibly related to the embodiment of the present invention, it may be, for example, a bladder area, a kidney area or a stomach area, which is not specifically limited herein.
The second student network is focused on a regression task, namely, is responsible for regression of a target, specifically, regression of a certain aspect of pixel points located in the target in the first medical image, and in connection with an application scenario possibly related to an embodiment of the present invention, the aspect may be, for example, a distance, a number or a pixel value, which may be set according to actual requirements, and is not specifically limited herein.
It can be understood that the features learned by the two-student network focusing on different tasks in the model training process are different, and the generalization capability of the model is improved through multi-task learning, so that the risk of overfitting is reduced.
In practical applications, the network structures of the two student networks and the teacher network may be identical, partially identical or completely different, which are not specifically limited herein.
S120, acquiring a first medical image acquired for a first sample object and a segmentation label obtained by marking a target in the first medical image, and acquiring a group of segmentation training samples and a group of regression training samples based on the first medical image and the segmentation label.
The first sample object may be understood as the subject for which the first medical image is acquired, and thus the object used to train the two student networks; in particular, it is an object containing the target, so that the acquired first medical image contains the target. The first medical image may be understood as a medical image acquired for the first sample object and containing the target; in connection with application scenarios possibly related to the embodiment of the present invention, the medical image may be, for example, a computed tomography (Computed Tomography, CT) image, a magnetic resonance imaging (Magnetic Resonance Imaging, MRI) image, an ultrasound image or the like, and is not specifically limited herein. A segmentation label may be understood as data that marks the target in the first medical image, thereby distinguishing the target from the rest of the first medical image; in practical applications, optionally, the data may be represented by means of an image or a position, etc., which is not specifically limited herein. The first medical image and the segmentation label are acquired.
Based on the first medical image and the segmentation label, a group of segmentation training samples which can be used for training the first student network is obtained, and optionally, the first medical image and the segmentation label can be directly used as a group of segmentation training samples in combination with the application scene possibly related to the embodiment of the invention.
Based on the first medical image and the segmentation label, a set of regression training samples which can be used for training the second student network is obtained; optionally, in connection with application scenarios possibly related to the embodiment of the invention, the segmentation label is transformed into a regression label, and then the first medical image and the regression label are used as a set of regression training samples.
S130, training a first student network based on a plurality of groups of segmentation training samples to obtain a segmented student network, and training a second student network based on a plurality of groups of regression training samples to obtain a regression student network.
The first student network is subjected to supervised training based on a plurality of groups of segmentation training samples, so that a segmentation student network which can be used for segmenting the target is obtained; the segmentation student network may also be called a target segmentation network. In the embodiment of the invention, optionally, binary cross entropy (Binary Cross Entropy, BCE) + Dice may be applied as the loss function; of course, other loss functions may also be employed, without specific limitation.
And performing supervised training on the second student network based on the plurality of groups of regression training samples to obtain a regression student network which can be used for regressing the target; the regression student network may also be called a target regression network. In the embodiment of the invention, the Huber loss function may optionally be adopted for network training; of course, in practical applications, other loss functions may also be used, which are not specifically limited herein.
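As an illustrative sketch (not part of the claimed method), the two loss functions named above, BCE + Dice for the segmentation student and Huber for the regression student, may be written in pure Python over flattened pixel lists; the function names and the `eps`/`delta` defaults are assumptions for illustration only:

```python
import math

def bce_dice_loss(probs, labels, eps=1e-6):
    """Binary cross-entropy + Dice loss over flattened probability/label lists."""
    n = len(probs)
    bce = -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
               for p, y in zip(probs, labels)) / n
    inter = sum(p * y for p, y in zip(probs, labels))
    dice = 1 - (2 * inter + eps) / (sum(probs) + sum(labels) + eps)
    return bce + dice

def huber_loss(preds, targets, delta=1.0):
    """Huber loss for distance regression: quadratic for small residuals,
    linear for large ones, hence robust to outlier distances."""
    total = 0.0
    for p, t in zip(preds, targets):
        r = abs(p - t)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(preds)
```

In practice a deep learning framework would provide these directly (e.g., `torch.nn.BCELoss` and `torch.nn.HuberLoss` in PyTorch).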
And S140, utilizing the segmentation student network and the regression student network to adjust network parameters in the teacher network, and obtaining a target model for the segmentation target.
The segmented student network and the regression student network are trained in a supervised mode, namely, the segmented student network and the regression student network fully learn the characteristics of the target in the first medical image, so that the two student networks, especially the network parameters in the two student networks, can be utilized to adjust the network parameters in the teacher network, and a target model for segmenting the target is obtained, wherein the target model is the teacher network with the network parameters adjusted.
For example, the segmentation performance of the segmented student network with respect to the target and the regression performance of the regressive student network with respect to the target may be determined first, so as to screen out the target student network with better performance from the two student networks. On the basis, further, the network parameters in the teacher network can be adjusted based on the network parameters in the target student network to obtain a target model; the network parameters in the target student network can be set to be higher in weight and the other student network can be set to be lower in weight, and then the network parameters in the teacher network are adjusted together based on the network parameters in the two student networks to obtain a target model, so that the influence of the target student network on the teacher network is enhanced; etc., and are not particularly limited herein.
Further exemplary, two student networks may be treated equally, and network parameters in the teacher network may be adjusted based on the network parameters of the two to obtain the target model.
Of course, the target model may also be obtained based on the remaining modes, which are not specifically limited herein.
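One way to realize the parameter adjustment described above is an exponential-moving-average (EMA) style update, in which each teacher parameter drifts toward a weighted combination of the two students' corresponding parameters; a better-performing student can be given a larger weight, as suggested above. The function name, the flat-list parameter representation and the default weights below are illustrative assumptions, not the patent's exact formulation:

```python
def update_teacher(teacher, seg_student, reg_student,
                   alpha=0.99, w_seg=0.5, w_reg=0.5):
    """Move each teacher parameter toward a weighted mix of the segmentation
    and regression students' parameters (EMA-style update).

    All three networks are represented as flat lists of parameter values;
    alpha controls how slowly the teacher changes, and w_seg/w_reg let the
    better-performing student influence the teacher more strongly.
    """
    assert abs(w_seg + w_reg - 1.0) < 1e-9
    return [alpha * t + (1 - alpha) * (w_seg * s + w_reg * r)
            for t, s, r in zip(teacher, seg_student, reg_student)]
```

Treating the two students equally corresponds to `w_seg = w_reg = 0.5`; screening out a target student network corresponds to shifting the weights toward it.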
According to the technical scheme of the embodiment of the invention, an original model with a double-student network structure is obtained, wherein the original model may comprise a first student network and a teacher network for segmenting a target and a second student network for regressing the target; a first medical image acquired for a first sample object and a segmentation label obtained by marking the target in the first medical image are acquired, and a group of segmentation training samples and a group of regression training samples are obtained based on the first medical image and the segmentation label; the first student network is trained based on a plurality of groups of segmentation training samples to obtain a segmentation student network, and the second student network is trained based on a plurality of groups of regression training samples to obtain a regression student network; on this basis, network parameters in the teacher network may be adjusted by utilizing the segmentation student network and the regression student network to obtain a target model for segmenting the target. By introducing the double-student network structure, the two student networks concentrate on different tasks and therefore learn different characteristics during model training, so that even when the segmentation labels are limited, the generalization capability of the model can be ensured through multi-task learning.
An optional technical solution, based on a first medical image and a segmentation label, obtains a set of regression training samples, including:
performing distance transformation on the segmentation labels to obtain distance labels, wherein the distance labels represent the distance between a pixel point located on a target in a first medical image and a first target point, and the first target point is a pixel point associated with the target in the first medical image;
the first medical image and distance label are used as a set of regression training samples.
When the regression task is to carry out regression on the distance, the distance conversion can be carried out on the split labels to obtain the distance labels. In the embodiment of the present invention, the distance label characterizes the distance between the pixel point located on the target (i.e. in the target) in the first medical image and the first target point, where the first target point may be a pixel point associated with the target in the first medical image, for example, a pixel point on a boundary of the target (i.e. a boundary of the target), or a pixel point located at a center position in each pixel point in the target, and the like, which is not specifically limited herein. In practical applications, optionally, in the case that the distance tag is an image with the same size as the first medical image, the pixel values of the pixels in the distance tag where the positions are not the positions where the targets are located may be preset values, and in the embodiment of the present invention, optionally, the preset values may be 0.
The first medical image and distance label are used as a set of regression training samples.
According to the technical scheme, the distance label is obtained by carrying out distance transformation on the split label, so that the regression training sample can be obtained based on the distance label, and therefore, the effective construction of the regression training sample is realized.
On this basis, optionally, the first target point is a pixel point located on the target boundary in the first medical image, and the distance label characterizes the distance between a pixel point located on the target in the first medical image and the target boundary, namely the distance between a pixel point in the target and the target boundary. In practical applications, the distance may be a minimum distance or a maximum distance, where the minimum distance is exemplified by a minimum Euclidean distance or a minimum Mahalanobis distance, which may be set according to practical situations and is not specifically limited herein.
According to the technical scheme, the distance between the pixel points in the target and the target boundary can be effectively regressed, and the distance is helpful for analyzing whether each pixel point in the first medical image is the target.
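The distance-label construction described above can be sketched as a brute-force Euclidean distance transform on a small binary mask (in practice `scipy.ndimage.distance_transform_edt` computes this efficiently). As described above, background pixels keep the preset value 0, and each in-target pixel stores its minimum distance to the nearest background pixel, which approximates the distance to the target boundary; the implementation below is an illustrative assumption:

```python
import math

def distance_label(mask):
    """Distance transform of a binary mask (list of lists of 0/1).

    For every pixel inside the target (value 1), store the minimum
    Euclidean distance to the nearest background pixel; background
    pixels keep the preset value 0.
    """
    h, w = len(mask), len(mask[0])
    background = [(i, j) for i in range(h) for j in range(w) if mask[i][j] == 0]
    label = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if mask[i][j] == 1:
                label[i][j] = min(math.hypot(i - bi, j - bj)
                                  for bi, bj in background)
    return label
```

The resulting distance label, paired with the first medical image, forms one regression training sample.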
FIG. 2 is a flow chart of another model training method provided in an embodiment of the present invention. The present embodiment is optimized based on the above technical solutions. In this embodiment, optionally, by using the segmentation student network and the regression student network, network parameters in the teacher network are adjusted to obtain a target model for the segmentation target, including: acquiring a second medical image acquired for a second sample object; and respectively inputting the second medical images into a segmentation student network, a regression student network and a teacher network aiming at each second medical image in the acquired plurality of second medical images so as to adjust network parameters in the teacher network according to the images respectively output by the networks and obtain a target model for segmenting the target. The same or corresponding terms as those of the above embodiments are not repeated herein.
Referring to fig. 2, the method of this embodiment may specifically include the following steps:
S210, acquiring an original model, wherein the original model comprises a first student network and a teacher network for segmenting a target and a second student network for regressing the target.
S220, acquiring a first medical image acquired for a first sample object and a segmentation label obtained by marking a target in the first medical image, and acquiring a group of segmentation training samples and a group of regression training samples based on the first medical image and the segmentation label.
S230, training the first student network based on a plurality of groups of segmentation training samples to obtain a segmentation student network, and training the second student network based on a plurality of groups of regression training samples to obtain a regression student network.
S240, acquiring a second medical image acquired for a second sample object.
The second sample object may be understood as the subject for which the second medical image is acquired, and thus the object used at least to train the teacher network; in particular, it is an object containing the target, so that the acquired second medical image contains the target. The second sample object is identical in nature to the first sample object; the names serve only to distinguish the sample objects applied during unsupervised training (i.e., the second sample objects) from those applied during supervised training (i.e., the first sample objects). In practical applications, the plurality of first sample objects may be part of the plurality of second sample objects, or the two may be completely different, which is not specifically limited herein.
The second medical image may be understood as a medical image acquired for the second sample object containing the target, the imaging modality of the medical image being the same as the first medical image. A second medical image is acquired. In practical applications, image acquisition may be performed on a plurality of second sample objects, respectively, to obtain a plurality of second medical images.
S250, for each second medical image in the acquired plurality of second medical images, respectively inputting the second medical images into a segmentation student network, a regression student network and a teacher network, so as to adjust network parameters in the teacher network according to the images respectively output by the networks, and obtain a target model for segmenting the target.
The second medical images are respectively input into a segmentation student network, a regression student network and a teacher network for each of the plurality of second medical images, so that the teacher network is subjected to unsupervised training according to the images respectively output by the networks (namely the segmentation student network, the regression student network and the teacher network), and particularly network parameters in the teacher network are adjusted to obtain a target model.
According to the technical scheme provided by the embodiment of the invention, by adopting semi-supervised learning (namely, the combination of unsupervised learning and supervised learning), network training can be performed based on a small amount of labeled data (namely, the first medical images) and a large amount of unlabeled data (namely, the second medical images), so that the dependence on labeled data is reduced, the utilization rate of unlabeled data is improved, and the model segmentation is more robust.
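The unsupervised step above can be sketched as consistency training: for an unlabeled second medical image, the teacher's segmentation prediction is compared with each student's prediction (the regression student's output first converted to segmentation form), and the mean squared difference serves as the consistency signal used to adjust the networks. This pure-Python sketch over flattened probability maps is an illustrative assumption, not the patent's exact formulation:

```python
def consistency_loss(teacher_probs, seg_probs, reg_derived_probs):
    """Sum of mean squared differences between the teacher's segmentation
    prediction and each student's segmentation-form prediction on the
    same unlabeled image."""
    n = len(teacher_probs)

    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / n

    return mse(teacher_probs, seg_probs) + mse(teacher_probs, reg_derived_probs)
```

A loss of zero means the three networks already agree on the unlabeled image; a large loss drives the parameter adjustment described above.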
FIG. 3 is a flow chart of yet another model training method provided in an embodiment of the present invention. The present embodiment is optimized based on the above technical solutions. In this embodiment, optionally, according to the images respectively output by the networks, network parameters in the teacher network are adjusted to obtain a target model for dividing the target, including: obtaining a first segmentation prediction image output by a segmentation student network, a regression prediction image output by a regression student network and a second segmentation prediction image output by a teacher network; processing the regression prediction image to obtain a third segmentation prediction image; comparing the first segmentation predicted image with the second segmentation predicted image to adjust network parameters in the segmentation student network based on the obtained first comparison result to obtain first parameters; comparing the third segmentation predicted image with the second segmentation predicted image to adjust network parameters in the regression student network based on the obtained second comparison result to obtain second parameters; and adjusting network parameters in the teacher network based on the first parameters and the second parameters to obtain a target model for dividing the target. The same or corresponding terms as those of the above embodiments are not repeated herein.
Referring to fig. 3, the method of this embodiment may specifically include the following steps:
S310, acquiring an original model, wherein the original model comprises a first student network and a teacher network for segmenting targets and a second student network for regressing targets.
S320, acquiring a first medical image acquired for a first sample object and a segmentation label obtained by marking a target in the first medical image, and acquiring a group of segmentation training samples and a group of regression training samples based on the first medical image and the segmentation label.
S330, training a first student network based on a plurality of groups of segmentation training samples to obtain segmented student networks, and training a second student network based on a plurality of groups of regression training samples to obtain regression student networks.
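The supervised training in S330 is not pinned to specific loss functions in this embodiment; as a hedged, illustrative sketch (the Dice and MSE losses below are common choices for segmentation and distance-map regression respectively, not losses the patent prescribes), the two student networks could be trained as follows:

```python
import numpy as np

def dice_loss(pred, label, eps=1e-6):
    # soft Dice loss for the segmentation student: approaches 0 as the
    # predicted probability map matches the segmentation label
    inter = np.sum(pred * label)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred * pred) + np.sum(label * label) + eps)

def mse_loss(pred, target):
    # mean squared error for the regression student, which regresses
    # the distance label rather than a binary mask
    return float(np.mean((pred - target) ** 2))

seg_label = np.array([[1.0, 0.0], [1.0, 0.0]])
print(dice_loss(seg_label, seg_label))  # a perfect prediction gives ~0 loss
```

Each student would minimize its own loss over the sets of segmentation and regression training samples, respectively.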
S340, acquiring a second medical image acquired for a second sample object.
S350, for each second medical image in the acquired plurality of second medical images, respectively inputting the second medical images into the segmentation student network, the regression student network and the teacher network to obtain a first segmentation predicted image output by the segmentation student network, a regression predicted image output by the regression student network and a second segmentation predicted image output by the teacher network.
The second medical image is input into the segmentation student network to obtain a first segmentation prediction image, where the first segmentation prediction image is the result of the segmentation student network predicting, for each pixel point in the second medical image, whether that pixel point belongs to the target, and may also be called a first target prediction image; the second medical image is input into the regression student network to obtain a regression prediction image, where the regression prediction image is the result of the regression student network performing regression on each pixel point in the second medical image, in particular each pixel point in the target; and the second medical image is input into the teacher network to obtain a second segmentation prediction image, where the second segmentation prediction image has the same meaning as the first segmentation prediction image, which is not repeated herein.
S360, processing the regression prediction image to obtain a third segmentation prediction image.
In the training process of the teacher network, in order to supervise the second segmentation prediction image output by the teacher network with the prediction images output by the student networks, the regression prediction image, whose meaning differs from that of the second segmentation prediction image, may be processed into a third segmentation prediction image with the same meaning as the second segmentation prediction image, so that the second segmentation prediction image can be supervised with the third segmentation prediction image.
S370, comparing the first segmentation predicted image with the second segmentation predicted image to adjust network parameters in the segmentation student network based on the obtained first comparison result, thereby obtaining first parameters.
The first segmentation prediction image is compared with the second segmentation prediction image, and a loss is then calculated based on the obtained first comparison result so as to adjust the network parameters in the segmentation student network and obtain the first parameters. In combination with the application scenarios possibly involved in the embodiment of the invention, optionally, the loss calculation can be performed by using the structural similarity index (structural similarity index, SSIM), i.e., during unsupervised training, the first segmentation prediction image output by the segmentation student network is used as supervision information to impose a consistency loss on the output of the teacher network; of course, the loss calculation may also be performed in other ways, which is not specifically limited herein.
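The SSIM-based consistency loss mentioned above can be sketched as follows. This is a simplified single-window SSIM computed over the whole prediction map (practical implementations, such as the one in scikit-image, use a sliding Gaussian window); the constants `c1` and `c2` are illustrative stabilizers, not values given in the patent:

```python
import numpy as np

def ssim(x, y, c1=1e-4, c2=9e-4):
    # simplified global SSIM between two prediction maps in [0, 1]
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2))

def consistency_loss(student_pred, teacher_pred):
    # identical predictions give SSIM = 1, hence zero consistency loss
    return 1.0 - ssim(student_pred, teacher_pred)
```

During unsupervised training, this loss would be computed between the first segmentation prediction image and the second segmentation prediction image.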
S380, comparing the third segmentation predicted image with the second segmentation predicted image to adjust network parameters in the regression student network based on the obtained second comparison result so as to obtain second parameters.
The adjustment process of the network parameters in the regression student network is similar to that of the segmentation student network, and will not be described herein.
And S390, adjusting network parameters in the teacher network based on the first parameters and the second parameters to obtain a target model for dividing the target.
Wherein the network parameters in the teacher network are adjusted based on the first parameters (i.e., the adjusted network parameters in the segmentation student network) and the second parameters (i.e., the adjusted network parameters in the regression student network). In connection with the application scenarios possibly involved in the embodiment of the present invention, optionally, the implementation of this parameter adjustment may refer to the several examples given in S140, which are not repeated herein.
According to the technical scheme provided by the embodiment of the invention, in the non-supervision training process, the prediction images respectively output by the two student networks obtained by training in the supervision training process are used as supervision information, and the teacher network is supervised, so that the effective training of the teacher network is realized.
An optional technical solution, based on the first parameter and the second parameter, adjusts a network parameter in a teacher network to obtain a target model for dividing a target, including:
Based on the first parameter and the second parameter, network parameters in the teacher network are adjusted in combination with an exponential moving average (Exponential Moving Average, EMA) strategy to obtain a target model for segmenting the target.
Illustratively, the adjustment process (i.e., update process) of the network parameters in the teacher network is as follows:
$\theta'_t = \alpha\,\theta'_{t-1} + (1-\alpha)\big(r\,\theta_t^{seg} + (1-r)\,\theta_t^{reg}\big)$

wherein $\theta'_t$ is the updated network parameter in the teacher network; $\alpha$ is a preset hyper-parameter; $\theta'_{t-1}$ is the network parameter in the teacher network before the update; $r$ is a learnable parameter in the student networks, and the importance of the two student networks to the teacher network is automatically adjusted through $r$ and $1-r$, so that the student network with the better teaching effect is given the larger weight; $\theta_t^{seg}$ is the first parameter; $\theta_t^{reg}$ is the second parameter.
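A minimal sketch of this EMA update, assuming the parameters of each network are represented as NumPy arrays keyed by layer name (the dictionary representation and default `alpha` are illustrative assumptions, not details from the patent):

```python
import numpy as np

def ema_update(theta_teacher, theta_seg, theta_reg, r, alpha=0.99):
    # blend the two students' parameters with weights r and 1 - r,
    # then move the teacher's parameters toward the blend by EMA
    mixed = {k: r * theta_seg[k] + (1.0 - r) * theta_reg[k] for k in theta_teacher}
    return {k: alpha * theta_teacher[k] + (1.0 - alpha) * mixed[k]
            for k in theta_teacher}

teacher = {"w": np.zeros(2)}
seg_student = {"w": np.ones(2)}
reg_student = {"w": 2 * np.ones(2)}
updated = ema_update(teacher, seg_student, reg_student, r=0.5, alpha=0.9)
print(updated["w"])  # [0.15 0.15]
```

A larger `alpha` keeps the teacher more stable across updates, while `r` shifts weight toward whichever student currently teaches better.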
According to the technical scheme, the teaching effects of the two student networks on the teacher network are considered, on the basis, the reliable network parameters are selected from the two student networks with updated network parameters, the network parameters in the teacher network are weighted and adjusted by the EMA strategy, and the characteristic distillation strategy enables the teacher network to learn more robust characteristic expression under limited computing resources, so that the segmentation performance of the teacher network is improved on the premise that the network complexity and the computing load are not increased.
In another alternative solution, the regression prediction image characterizes a distance between a pixel point located on a target in the second medical image and a second target point, where the second target point is a pixel point associated with the target in the second medical image, and the processing the regression prediction image to obtain a third segmented prediction image includes:
For all pixel points in the regression prediction image, determining a first pixel point with a pixel value greater than 0 and a second pixel point with the pixel value 0 in all pixel points;
and obtaining a third segmentation prediction image according to each first pixel point and each second pixel point.
The first pixel with the pixel value greater than 0 in all the pixels is the pixel in the target, and the second pixel with the pixel value 0 is the pixel outside the target, so that the third segmentation prediction image can be obtained according to the first pixel and the second pixels.
According to the technical scheme, when the target regression is performed by using the distance, the pixel point in the target in each pixel point can be obtained according to the relation between the pixel value of each pixel point in the regression prediction image and 0, so that the conversion from the regression prediction image to the third segmentation prediction image is realized.
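The conversion from a regression prediction image to a third segmentation prediction image described above reduces to a simple threshold at 0:

```python
import numpy as np

def distance_map_to_mask(dist_map):
    # pixels with predicted distance > 0 lie inside the target (first pixel
    # points); pixels with value 0 lie outside the target (second pixel points)
    return (dist_map > 0).astype(np.uint8)

dist = np.array([[0.0, 1.5], [2.0, 0.0]])
print(distance_map_to_mask(dist))  # [[0 1] [1 0]]
```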
On this basis, in order to understand the above technical solutions more intuitively, an exemplary description is given below with reference to a specific example. Illustratively, as shown in fig. 4, supervised training is performed on the first student network by using the first sample medical image and the segmentation label to obtain the segmentation student network; specifically, the first sample medical image is processed by the first student network to obtain a first output, and a supervised loss is then calculated between the first output and the segmentation label to realize the supervised training; similarly, distance transformation is carried out on the segmentation label to obtain a distance label, and the first sample medical image and the distance label are used to perform supervised training on the second student network to obtain the regression student network.
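The distance transformation of the segmentation label mentioned above can be sketched as follows. In practice `scipy.ndimage.distance_transform_edt` computes this efficiently; the brute-force NumPy version below is only for clarity, and it approximates the distance to the target boundary by the distance to the nearest background pixel:

```python
import numpy as np

def distance_label(mask):
    # for each foreground pixel, the Euclidean distance to the nearest
    # background pixel; background pixels keep the value 0
    fg = np.argwhere(mask > 0)
    bg = np.argwhere(mask == 0)
    out = np.zeros(mask.shape, dtype=float)
    for p in fg:
        out[tuple(p)] = np.sqrt(((bg - p) ** 2).sum(axis=1)).min()
    return out

mask = np.zeros((3, 3))
mask[1, 1] = 1
print(distance_label(mask))  # center pixel gets distance 1.0
```

The resulting distance map is paired with the first sample medical image as a regression training sample.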
Further, inputting the second sample medical image into a segmentation student network (namely a first student network with supervision training) to obtain a first segmentation prediction image; inputting the second sample medical image into a teacher network to obtain a second segmentation predicted image; and inputting the second sample medical image into a regression student network (namely, the second student network with supervision training) to obtain a regression prediction image, and processing the regression prediction image into a third segmentation prediction image.
Still further, calculating a consistency loss between the first segmented prediction image and the second segmented prediction image, thereby adjusting network parameters in the segmented student network; and calculating a consistency loss between the third segmented prediction image and the second segmented prediction image, thereby adjusting network parameters in the regression student network. Based on the above, based on the adjusted network parameters in the two student networks, the network parameters in the teacher network are adjusted by combining with the EMA strategy, so that the target model is obtained through training.
By combining the semi-supervised learning and the dual-student network structure, the problems of strong data dependence, limited generalization capability and high computing resource requirements existing in the target segmentation at present are effectively solved.
Fig. 5 is a flowchart of a target segmentation method according to an embodiment of the present invention. The present embodiment is applicable to the case of target segmentation. The method may be performed by the object segmentation apparatus provided by the embodiments of the present invention, where the apparatus may be implemented by software and/or hardware, and the apparatus may be integrated on an electronic device, where the electronic device may be various user terminals or servers.
Referring to fig. 5, the method of the embodiment of the present invention specifically includes the following steps:
S410, acquiring a target medical image acquired for a target object, a target model trained by the model training method according to any embodiment of the invention and a target applied in the training process of the target model.
The target object may be understood as an object containing a target, which is a target applied in a training process of the target model. The target medical image may be understood as a medical image obtained after image acquisition of the target object, the imaging modality of which is the same as the first medical image.
A medical image of the target, a target model, and the target are acquired.
S420, inputting the target medical image into the target model, and obtaining a target area where the target is located in the target medical image according to an output result of the target model.
The target region is understood to be the region in which the target is located in the target medical image. The target medical image is input into the target model so that the target medical image can be processed with the target model to segment a target region from the target medical image to achieve target segmentation.
According to the technical scheme provided by the embodiment of the invention, the accurate segmentation of the target is realized by utilizing the target model.
On this basis, a matching example of model training and application process is given here, specifically as follows:
1. In the model training process, window width and window level truncation is carried out on a three-dimensional (three dimensional, 3D) CT image applied as a sample medical image, and min-max normalization is carried out, namely $x_{norm} = (x - x_{min}) / (x_{max} - x_{min})$, which enhances the generalization capability of the model so that it can process CT images acquired by different CT devices.
2. Data blocks of a preset size N are randomly cropped from the CT image, which reduces the input size to save computing resources while increasing the data volume and data randomness. Each data block is input into the model as a sample medical image for semi-supervised training. Moreover, various data enhancements may be applied to the data blocks to improve the generalization ability of the model, such as random rotation, random translation, random brightness variation, etc.
3. In the model application process, the CT image to be predicted is normalized in the same way; the whole image is then traversed along the three axes with a step length of N/2, each data block is predicted by the target model, and the predicted results of each pixel point from the overlapping blocks are averaged to obtain a predicted probability map representing the probability that each pixel point in the CT image is the target.
4. A suitable probability threshold is selected to binarize the predicted probability map, and scattered false positives are removed by connected-component analysis in a post-processing stage to obtain the final target segmentation result.
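Step 1 above can be sketched as follows; the window center and width values are illustrative soft-tissue settings, not values specified in the patent:

```python
import numpy as np

def preprocess_ct(volume, window_center=40.0, window_width=400.0):
    # truncate intensities to the CT window, then min-max normalize to [0, 1]
    lo = window_center - window_width / 2.0
    hi = window_center + window_width / 2.0
    v = np.clip(volume.astype(float), lo, hi)
    return (v - v.min()) / (v.max() - v.min() + 1e-8)

vol = np.array([[-1000.0, 0.0], [40.0, 3000.0]])  # Hounsfield units
print(preprocess_ct(vol))  # values mapped into [0, 1]
```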
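The random cropping and data enhancement of step 2 can be sketched as below; the choice of a random flip as the enhancement is an illustrative assumption (the patent also mentions random rotation, translation, and brightness variation):

```python
import numpy as np

def random_patch(volume, size, rng=None):
    # randomly crop a data block of the preset size from the CT volume
    rng = rng or np.random.default_rng()
    starts = [int(rng.integers(0, s - n + 1)) for s, n in zip(volume.shape, size)]
    block = volume[tuple(slice(st, st + n) for st, n in zip(starts, size))]
    # simple data enhancement: flip along a random axis half the time
    if rng.random() < 0.5:
        block = np.flip(block, axis=int(rng.integers(0, block.ndim)))
    return block

vol = np.arange(4 * 4 * 4, dtype=float).reshape(4, 4, 4)
patch = random_patch(vol, (2, 2, 2), rng=np.random.default_rng(0))
print(patch.shape)  # (2, 2, 2)
```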
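The post-processing of step 4 (binarization followed by connected-component filtering) can be sketched in 2D with a BFS flood fill; in practice a library routine such as `scipy.ndimage.label` would be used, and the `min_size` cutoff here is an assumed illustrative parameter:

```python
import numpy as np
from collections import deque

def postprocess(prob_map, threshold=0.5, min_size=2):
    # binarize the probability map, then keep only 4-connected components
    # of at least min_size pixels, dropping scattered false positives
    binary = (prob_map >= threshold).astype(np.uint8)
    visited = np.zeros(binary.shape, dtype=bool)
    out = np.zeros_like(binary)
    h, w = binary.shape
    for i in range(h):
        for j in range(w):
            if binary[i, j] and not visited[i, j]:
                comp, q = [], deque([(i, j)])
                visited[i, j] = True
                while q:  # BFS over one connected component
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] \
                                and not visited[ny, nx]:
                            visited[ny, nx] = True
                            q.append((ny, nx))
                if len(comp) >= min_size:
                    for y, x in comp:
                        out[y, x] = 1
    return out
```

For example, a 2x2 blob survives while an isolated single pixel is removed as a false positive.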
Fig. 6 is a block diagram of a model training apparatus according to an embodiment of the present invention, which is configured to perform the model training method according to any of the foregoing embodiments. The device and the model training method of each embodiment belong to the same invention conception, and the detailed content which is not described in detail in the embodiment of the model training device can be referred to the embodiment of the model training method. Referring to fig. 6, the apparatus may specifically include: the system comprises an original model acquisition module 510, a training sample acquisition module 520, a regression student network acquisition module 530 and a target model acquisition module 540.
The original model obtaining module 510 is configured to obtain an original model, where the original model includes a first student network and a teacher network for dividing a target and a second student network for regressing the target;
the training sample obtaining module 520 is configured to obtain a first medical image acquired for a first sample object and a segmentation label obtained by marking a target in the first medical image, and obtain a set of segmentation training samples and a set of regression training samples based on the first medical image and the segmentation label;
The regression student network obtaining module 530 is configured to train the first student network based on a plurality of sets of segmentation training samples to obtain a segmented student network, and train the second student network based on a plurality of sets of regression training samples to obtain a regression student network;
The target model obtaining module 540 is configured to adjust network parameters in the teacher network by using the segmentation student network and the regression student network to obtain a target model for segmenting the target.
Optionally, the training sample obtaining module 520 may include:
The distance label obtaining unit is used for carrying out distance transformation on the segmentation labels to obtain distance labels, wherein the distance labels represent the distance between a pixel point positioned on a target in a first medical image and a first target point, and the first target point is a pixel point associated with the target in the first medical image;
And the regression training sample obtaining unit is used for taking the first medical image and the distance label as a group of regression training samples.
On the basis, optionally, the first target point is a pixel point located on the target boundary of the target in the first medical image, and the distance label characterizes the distance between the pixel point located on the target and the target boundary in the first medical image.
Optionally, the object model obtaining module 540 may include:
a second medical image acquisition sub-module for acquiring a second medical image acquired for a second sample object;
The target model obtaining sub-module is used for respectively inputting the second medical images into the segmentation student network, the regression student network and the teacher network aiming at each of the acquired plurality of second medical images so as to adjust network parameters in the teacher network according to the images respectively output by the networks and obtain a target model for segmenting the target.
On this basis, optionally, the target model obtaining sub-module may include:
the second segmentation prediction image obtaining unit is used for obtaining a first segmentation prediction image output by a segmentation student network, a regression prediction image output by a regression student network and a second segmentation prediction image output by a teacher network;
A third divided prediction image obtaining unit for processing the regression prediction image to obtain a third divided prediction image;
the first parameter obtaining unit is used for comparing the first segmentation predicted image with the second segmentation predicted image so as to adjust network parameters in the segmentation student network based on the obtained first comparison result and obtain the first parameter;
the second parameter obtaining unit is used for comparing the third segmentation predicted image with the second segmentation predicted image so as to adjust network parameters in the regression student network based on the obtained second comparison result and obtain second parameters;
The target model obtaining unit is used for adjusting network parameters in the teacher network based on the first parameters and the second parameters to obtain a target model for dividing the target.
On this basis, an optional object model obtaining unit is specifically configured to:
Based on the first parameters and the second parameters, the network parameters in the teacher network are adjusted in combination with the exponential moving average (EMA) strategy to obtain a target model for segmenting the target.
Alternatively, the regression prediction image characterizes a distance between a pixel point located on the target in the second medical image and a second target point, the second target point being a pixel point associated with the target in the second medical image;
A third divided prediction image obtaining unit specifically configured to:
For all pixel points in the regression prediction image, determining a first pixel point with a pixel value greater than 0 and a second pixel point with the pixel value 0 in all pixel points;
and obtaining a third segmentation prediction image according to each first pixel point and each second pixel point.
According to the model training device provided by the embodiment of the invention, an original model with a dual-student network structure is obtained through the original model obtaining module, and the original model may comprise a first student network and a teacher network for segmenting targets and a second student network for regressing targets; through the training sample obtaining module, a first medical image acquired for a first sample object and a segmentation label obtained by marking the target in the first medical image are obtained, and a group of segmentation training samples and a group of regression training samples are obtained based on the first medical image and the segmentation label; through the regression student network obtaining module, the first student network is trained based on a plurality of groups of segmentation training samples to obtain the segmentation student network, and the second student network is trained based on a plurality of groups of regression training samples to obtain the regression student network; on this basis, through the target model obtaining module, the network parameters in the teacher network are adjusted by utilizing the segmentation student network and the regression student network to obtain a target model for segmenting the target. By introducing the dual-student network structure, the two student networks concentrate on different tasks and therefore learn different characteristics during model training, so that even if segmentation labels are limited, the generalization capability of the model can be ensured through multi-task learning and the risk of overfitting is reduced.
The model training device provided by the embodiment of the invention can execute the model training method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the embodiment of the model training apparatus, each unit and module included are only divided according to the functional logic, but are not limited to the above-mentioned division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Fig. 7 is a block diagram of a target segmentation apparatus according to an embodiment of the present invention, where the apparatus is configured to perform the target segmentation method according to any of the foregoing embodiments. The device and the object segmentation method of each embodiment belong to the same invention conception, and the details of the embodiment of the object segmentation device, which are not described in detail, can be referred to the embodiment of the object segmentation method. Referring to fig. 7, the apparatus may specifically include: a target acquisition module 610 and a target segmentation module 620.
The target obtaining module 610 is configured to obtain a target medical image collected for a target object, a target model trained by the model training method according to any embodiment of the present invention, and a target applied in a training process of the target model;
The target segmentation module 620 is configured to input a target medical image into the target model, and obtain a target region where the target is located in the target medical image according to an output result of the target model.
According to the target segmentation device provided by the embodiment of the invention, through the mutual matching of the target acquisition module and the target segmentation module, the accurate segmentation of the target is realized by utilizing the target model.
The object segmentation device provided by the embodiment of the invention can execute the object segmentation method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the above embodiment of the target segmentation apparatus, each unit and module included are only divided according to the functional logic, but are not limited to the above-mentioned division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Fig. 8 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic equipment may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 8, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as a model training method or a target segmentation method.
In some embodiments, the model training method or the object segmentation method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the model training method or the object segmentation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the model training method or the target segmentation method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, field Programmable Gate Arrays (FPGAs), application Specific Integrated Circuits (ASICs), application Specific Standard Products (ASSPs), systems On Chip (SOCs), load programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the drawbacks of difficult management and weak service scalability in traditional physical hosts and Virtual Private Server (VPS) services.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of model training, comprising:
obtaining an original model, wherein the original model comprises a first student network and a teacher network for segmenting a target, and a second student network for regressing the target;
acquiring a first medical image acquired for a first sample object and a segmentation label obtained by marking the target in the first medical image, and obtaining a set of segmentation training samples and a set of regression training samples based on the first medical image and the segmentation label;
training the first student network based on a plurality of sets of the segmentation training samples to obtain a segmentation student network, and training the second student network based on a plurality of sets of the regression training samples to obtain a regression student network;
and adjusting network parameters in the teacher network by using the segmentation student network and the regression student network to obtain a target model for segmenting the target.
2. The method of claim 1, wherein the obtaining a set of regression training samples based on the first medical image and the segmentation label comprises:
performing distance transformation on the segmentation label to obtain a distance label, wherein the distance label represents the distance between a pixel point located on the target in the first medical image and a first target point, and the first target point is a pixel point associated with the target in the first medical image;
and taking the first medical image and the distance label as a set of regression training samples.
3. The method of claim 2, wherein the first target point is a pixel point in the first medical image that is located on a target boundary of the target, and the distance label characterizes the distance between the pixel point located on the target in the first medical image and the target boundary.
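The distance label of claims 2-3 admits a short sketch. As an illustrative assumption only (the claims do not name a specific transform or library), SciPy's Euclidean distance transform maps each on-target pixel to its distance from the nearest background pixel, i.e. the target boundary, while background pixels stay at 0; the function name `distance_label` is hypothetical.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def distance_label(seg_label: np.ndarray) -> np.ndarray:
    """Turn a binary segmentation label into a distance label.

    For each pixel point on the target (value 1), the label stores the
    Euclidean distance to the nearest background pixel, i.e. to the
    target boundary; background pixels remain 0.
    """
    # distance_transform_edt measures, for every nonzero pixel, the
    # distance to the nearest zero pixel -- here, the target boundary.
    return distance_transform_edt(seg_label)

seg = np.zeros((5, 5), dtype=np.uint8)
seg[1:4, 1:4] = 1            # a 3x3 square target
dist = distance_label(seg)   # center pixel is 2.0 from the boundary
```

Training pairs of the first medical image and such a distance label would then serve as the regression training samples of claim 2.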
4. The method of claim 1, wherein said adjusting network parameters in the teacher network using the segmentation student network and the regression student network to obtain a target model for segmenting the target comprises:
acquiring a second medical image acquired for a second sample object;
and for each of a plurality of acquired second medical images, inputting the second medical image into the segmentation student network, the regression student network, and the teacher network respectively, so as to adjust network parameters in the teacher network according to the images respectively output by the networks and obtain a target model for segmenting the target.
5. The method of claim 4, wherein the adjusting network parameters in the teacher network according to the images respectively output by the networks to obtain a target model for segmenting the target comprises:
obtaining a first segmentation predicted image output by the segmentation student network, a regression predicted image output by the regression student network and a second segmentation predicted image output by the teacher network;
Processing the regression prediction image to obtain a third segmentation prediction image;
comparing the first segmentation predicted image with the second segmentation predicted image to adjust network parameters in the segmentation student network based on the obtained first comparison result to obtain first parameters;
comparing the third segmentation predicted image with the second segmentation predicted image to adjust network parameters in the regression student network based on the obtained second comparison result to obtain second parameters;
and adjusting network parameters in the teacher network based on the first parameters and the second parameters to obtain a target model for segmenting the target.
6. The method of claim 5, wherein the adjusting network parameters in the teacher network based on the first parameters and the second parameters to obtain a target model for segmenting the target comprises:
adjusting network parameters in the teacher network based on the first parameters and the second parameters, in combination with an exponential moving average strategy, to obtain a target model for segmenting the target.
7. The method of claim 5, wherein the regression prediction image characterizes a distance between a pixel on the target in the second medical image and a second target point, the second target point being a pixel associated with the target in the second medical image;
The processing the regression prediction image to obtain a third segmentation prediction image comprises the following steps:
determining, among all pixel points in the regression prediction image, first pixel points whose pixel values are greater than 0 and second pixel points whose pixel values are equal to 0;
and obtaining a third segmentation prediction image according to the first pixel points and the second pixel points.
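The conversion in claim 7, where first pixel points (value > 0) become target and second pixel points (value 0) become background, amounts to thresholding the regression prediction at zero. A minimal sketch, with the function name assumed:

```python
import numpy as np

def regression_to_segmentation(reg_pred: np.ndarray) -> np.ndarray:
    """Binarize a regression (distance) prediction into a segmentation map.

    Pixel points with predicted distance > 0 are taken as target
    (first pixel points); pixel points with value 0 are taken as
    background (second pixel points).
    """
    return (reg_pred > 0).astype(np.uint8)

reg = np.array([[0.0, 1.5],
                [0.0, 0.2]])
seg = regression_to_segmentation(reg)   # [[0, 1], [0, 1]]
```

The resulting binary map is the third segmentation prediction image, which can then be compared with the teacher's segmentation prediction as in claim 5.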
8. A target segmentation method, comprising:
acquiring a target medical image acquired for a target object, a target model trained according to the model training method of any one of claims 1-7, and a target applied in a training process of the target model;
inputting the target medical image into the target model, and obtaining a target area where the target is located in the target medical image according to an output result of the target model.
9. A model training device, comprising:
an original model acquisition module, configured to acquire an original model, wherein the original model comprises a first student network and a teacher network for segmenting a target, and a second student network for regressing the target;
a training sample obtaining module, configured to obtain a first medical image acquired for a first sample object and a segmentation label obtained by marking the target in the first medical image, and to obtain a set of segmentation training samples and a set of regression training samples based on the first medical image and the segmentation label;
a regression student network obtaining module, configured to train the first student network based on a plurality of sets of the segmentation training samples to obtain a segmentation student network, and to train the second student network based on a plurality of sets of the regression training samples to obtain a regression student network;
and a target model obtaining module, configured to adjust network parameters in the teacher network by using the segmentation student network and the regression student network to obtain a target model for segmenting the target.
10. A target segmentation apparatus, comprising:
a target acquisition module, configured to acquire a target medical image acquired for a target object, a target model trained according to the model training method of any one of claims 1-7, and a target applied in a training process of the target model;
and a target segmentation module, configured to input the target medical image into the target model and obtain a target area where the target is located in the target medical image according to an output result of the target model.
CN202410374049.7A 2024-03-29 2024-03-29 Model training method, target segmentation method and device Pending CN118196119A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410374049.7A CN118196119A (en) 2024-03-29 2024-03-29 Model training method, target segmentation method and device

Publications (1)

Publication Number Publication Date
CN118196119A 2024-06-14

Family

ID=91403743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410374049.7A Pending CN118196119A (en) 2024-03-29 2024-03-29 Model training method, target segmentation method and device

Country Status (1)

Country Link
CN (1) CN118196119A (en)


Legal Events

Date Code Title Description
PB01 Publication