CN110263804B - Medical image segmentation method based on safe semi-supervised clustering

Info

Publication number: CN110263804B
Application number: CN201910371366.2A
Authority: CN
Inventors: 郭丽; 甘海涛; 夏思雨; 厉振华
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2019-05-06
Filing date: 2019-05-06
Publication date: 2021-08-03
Anticipated expiration: 2039-05-06
Also published as: CN110263804A

Abstract

The invention discloses a medical image segmentation method based on safe semi-supervised clustering, and relates to semi-supervised FCM clustering and density peak clustering. First, a local graph is constructed with the k-nearest neighbor method to obtain a graph regularization term. Second, FCM clustering and density clustering are used to estimate the confidence of the labeled and unlabeled samples. Then, the confidence weights of the samples and the regularization term based on the local graph are introduced into the objective function of the original semi-supervised FCM clustering method to obtain the objective function of the safe semi-supervised clustering method. Finally, the clustering result is obtained by iteratively optimizing the membership matrix and the cluster centers. The invention addresses the safe use of both labeled and unlabeled samples, and improves the accuracy and robustness of medical image segmentation.

Description

Medical image segmentation method based on safe semi-supervised clustering

Technical Field

The invention relates to a medical image segmentation method based on semi-supervised clustering, in particular to a medical image segmentation method based on safe semi-supervised clustering, and belongs to the field of data mining based on medical images.

Background

With the continuous development of visualization technology, modern medicine has become increasingly dependent on the processing of medical image information, and medical images play an important role in clinical diagnosis, teaching, and scientific research. A medical image segmentation method based on semi-supervised clustering incorporates limited manual supervision: a small number of points are clicked on the image to identify the relationship between the corresponding regions. These points serve as labeled sample data and are used to guide the clustering, which improves algorithm performance and makes the segmentation more accurate. Labeling of medical images is generally done by experts, but wrong labels may occur for various reasons during the labeling process, and medical images often contain noise points and outliers. Traditional medical image segmentation methods based on semi-supervised clustering do not take these two issues into account during clustering.

In this case, the performance of a conventional semi-supervised clustering method may be worse than that of the corresponding unsupervised method, which limits the application of semi-supervised clustering to medical image segmentation to some extent. In other words, the labeled data may harm performance, while noise and outliers in the unlabeled data also have a large impact on performance. Traditional semi-supervised clustering generally assumes that prior knowledge benefits learning, but the collected prior knowledge (such as mislabeled samples and noise) may degrade learning performance. Xuesong Yin indicates that wrong prior knowledge can lead to a degradation of learning performance. For these two reasons, it makes sense to design a safe semi-supervised learning method. The invention therefore seeks a mechanism in which different samples have different degrees of safety, so that the clustering performance is not lower than that of the original unsupervised clustering and semi-supervised clustering methods.

Disclosure of Invention

The invention provides a medical image segmentation method based on safe semi-supervised clustering. It addresses the defect that traditional medical image segmentation methods based on semi-supervised clustering do not consider the risk of labeled samples and unlabeled samples simultaneously, which can degrade the final segmentation result.

First, the invention constructs a local graph with the k-nearest neighbor method to obtain a graph regularization term. Second, FCM clustering and density clustering are used to estimate the confidence of the labeled and unlabeled samples. Then, the confidence weights of the samples and the regularization term based on the local graph are introduced into the objective function of the original semi-supervised FCM clustering method to obtain the objective function of the safe semi-supervised clustering method. Finally, the clustering result is obtained by iteratively optimizing the membership matrix and the cluster centers. The technical scheme is as follows: a medical image segmentation method based on safe semi-supervised clustering comprises the following steps:

Step one: input the labeled and unlabeled medical image datasets;

Step two: perform FCM clustering on the data set to obtain predicted labels of the data set;

Step three: using the density peak clustering method, obtain the confidence of each unlabeled sample from its local density and its minimum distance to points of higher density; obtain the confidence of each labeled sample from its local density within the same labeled sample cluster and its minimum distance to points of higher density; and normalize the confidences;

Step four: construct a local graph in order to constrain the outputs of labeled samples with low confidence to the outputs of their neighboring samples;

Step five: integrate the above information to construct the objective function;

Step six: solve the optimization problem by an iterative optimization method;

Step seven: determine the category of each unlabeled sample to realize medical image segmentation.

Compared with traditional semi-supervised clustering methods, the method measures the confidence of each sample from the density of the samples and the distances between them, and constrains labeled samples with low confidence to the outputs of their neighboring samples by constructing a local graph, so that every sample is used safely and reasonably and the clustering is more accurate and robust. The invention addresses the safe use of both labeled and unlabeled samples, and improves the accuracy and robustness of medical image segmentation.

Drawings

FIG. 1 is a flow chart of an embodiment of the present invention.

Detailed Description

While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

To better illustrate the objects and advantages of the invention, an embodiment of the method is described in further detail below with reference to FIG. 1 and examples.

Step one: input the medical image dataset. Labeled sample subset: $X_l = [x_1, \dots, x_l]$, with corresponding labels $y_k \in \{1, \dots, c\}$; unlabeled sample subset: $X_u = [x_{l+1}, \dots, x_n]$.

Step two: obtain predicted labels of the data set through FCM clustering. The Kuhn-Munkres algorithm is then used to map the predicted labels to equivalent labels that are consistent in category with the given labels $y_k$.
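The label-mapping step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: labels are 0-based rather than 1..c, the function name `map_cluster_labels` is hypothetical, and an exhaustive search over permutations is swapped in for the Kuhn-Munkres algorithm. Both find the one-to-one assignment with maximum overlap, but the exhaustive search is practical only for a small number of classes.

```python
from itertools import permutations

def map_cluster_labels(pred, given, n_classes):
    """Return mapping[cluster] = class that best matches the given labels.

    pred  : FCM cluster index (0..n_classes-1) of each labeled sample
    given : expert label (0..n_classes-1) of the same samples
    """
    # overlap[i][j] = number of samples in cluster i that carry given label j
    overlap = [[0] * n_classes for _ in range(n_classes)]
    for p, g in zip(pred, given):
        overlap[p][g] += 1
    # Exhaustive search over one-to-one assignments (stand-in for Kuhn-Munkres).
    best = max(
        permutations(range(n_classes)),
        key=lambda perm: sum(overlap[i][perm[i]] for i in range(n_classes)),
    )
    return list(best)

pred = [0, 0, 1, 1, 2, 2]    # cluster indices produced by clustering
given = [2, 2, 0, 0, 1, 1]   # labels supplied by the expert
mapping = map_cluster_labels(pred, given, 3)
mapped = [mapping[p] for p in pred]
```

After the mapping, a cluster index and a given label refer to the same category, so predicted and given labels can be compared directly.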

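Step three: the confidences of the unlabeled and labeled samples are obtained with the density peak clustering method and then normalized.

Local density of an unlabeled sample:

where $j = 1, 2, \dots, n$, $k = l+1, \dots, n$, $\mathrm{dist}(k, j)$ is the Euclidean distance between points $x_k$ and $x_j$, and $d_c$ is the truncation distance.

Minimum distance of an unlabeled sample from points with higher density:

and for the data point with the greatest density:

Unlabeled sample confidence: $\gamma_k = \rho_k / \delta_k$ (4)

Unlabeled sample confidence normalization: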
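The confidence computation can be sketched in Python as follows. Only the ratio $\gamma_k = \rho_k / \delta_k$ of equation (4) is taken from the text. The cutoff-kernel density (a count of neighbors closer than $d_c$), the use of the largest distance for the densest point, and normalization by the maximum confidence are assumptions borrowed from the usual density peak formulation, and the function name is hypothetical. The sketch computes a value for every point; the method above would read off the entries of the unlabeled samples.

```python
import math

def density_confidence(points, d_c):
    """Normalized confidence gamma = rho / delta for every point."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    # Local density (assumed cutoff kernel): neighbors closer than d_c.
    rho = [sum(1 for j in range(n) if j != k and d[k][j] < d_c) for k in range(n)]
    gamma = []
    for k in range(n):
        # Distances to all points that are denser than point k.
        higher = [d[k][j] for j in range(n) if rho[j] > rho[k]]
        # The densest point has no denser point: use its largest distance.
        delta = min(higher) if higher else max(d[k])
        gamma.append(rho[k] / delta if delta > 0 else 0.0)  # equation (4)
    top = max(gamma)
    # Assumed normalization: divide by the largest confidence.
    return [g / top if top > 0 else 0.0 for g in gamma]

points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]
conf = density_confidence(points, d_c=1.1)
```

In this toy data the isolated point (5, 5) has no neighbor within the truncation distance, so its density and therefore its confidence are zero.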

Local density of a labeled sample within the same labeled sample cluster:

where $j_y = 1, 2, \dots, q$, $k' = 1, 2, \dots, l$, and $j_y$ indexes the set of samples that have the same label as the labeled sample point $x_{k'}$.

Minimum distance of a labeled sample from points with higher density in the same labeled sample cluster:

and for the data point with the greatest density:

Labeled sample confidence:

Labeled sample confidence normalization:
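For labeled samples, the same quantities are evaluated only among samples that share a label. The Python sketch below illustrates why this exposes a wrong label. It is an assumption-laden illustration: the cutoff-kernel density, the ratio form of the confidence, and the normalization by the global maximum are assumed, as are the function names. The grouping by label is passed in by the caller.

```python
import math
from collections import defaultdict

def raw_confidence(points, d_c):
    """gamma = rho / delta for each point, computed inside one label group."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != k and d[k][j] < d_c) for k in range(n)]
    gamma = []
    for k in range(n):
        higher = [d[k][j] for j in range(n) if rho[j] > rho[k]]
        delta = min(higher) if higher else max(d[k])
        gamma.append(rho[k] / delta if delta > 0 else 0.0)
    return gamma

def labeled_confidence(points, labels, d_c):
    """Confidence of each sample measured within its own label group."""
    groups = defaultdict(list)
    for idx, lab in enumerate(labels):
        groups[lab].append(idx)
    gamma = [0.0] * len(points)
    for members in groups.values():
        values = raw_confidence([points[i] for i in members], d_c)
        for i, value in zip(members, values):
            gamma[i] = value
    top = max(gamma)
    return [v / top if top > 0 else 0.0 for v in gamma]  # assumed normalization

# The sample at (9, 9) lies in the class-1 region but carries label 0.
points = [(0, 0), (0, 1), (1, 0), (0.5, 0.5), (9, 9), (9, 10), (10, 9), (10, 10)]
labels = [0, 0, 0, 0, 0, 1, 1, 1]
conf = labeled_confidence(points, labels, d_c=1.5)
```

Because the mislabeled sample is far from every other sample with label 0, its within-group density is zero and it receives the lowest confidence.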

Step four: construct a k-nearest neighbor local graph in order to constrain the outputs of labeled samples with low confidence to those of their neighboring samples.

A local neighborhood graph of the labeled samples is constructed, and the edge weights of the local graph, $W = [w_{k'r}]_{n \times n}$, are calculated as follows:

where $N_p(x_{k'})$ denotes the $p$ nearest neighbors of $x_{k'}$, $x_{k'}$ is a labeled sample point, $x_r$ is a neighboring sample point, and $\sigma$ is the width parameter of the Gaussian kernel.
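A minimal Python sketch of such a local graph is given below. The weight formula is not reproduced in the text, so the common Gaussian form $\exp(-\lVert x_{k'} - x_r \rVert^2 / \sigma^2)$ for $x_r \in N_p(x_{k'})$ and 0 otherwise is assumed here, and the function name `knn_graph_weights` is hypothetical.

```python
import math

def knn_graph_weights(points, labeled_idx, p, sigma):
    """n x n weight matrix with nonzero rows only for labeled samples."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for k in labeled_idx:
        # Other samples sorted by distance to x_k; the first p are N_p(x_k).
        others = sorted(
            (r for r in range(n) if r != k),
            key=lambda r: math.dist(points[k], points[r]),
        )
        for r in others[:p]:
            d2 = math.dist(points[k], points[r]) ** 2
            W[k][r] = math.exp(-d2 / sigma ** 2)  # assumed Gaussian weight
    return W

points = [(0, 0), (1, 0), (0, 2), (5, 5)]
W = knn_graph_weights(points, labeled_idx=[0], p=2, sigma=1.0)
```

Only the labeled sample at index 0 gets edges, to its two nearest neighbors; the distant point (5, 5) and all rows of unlabeled samples keep weight zero.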

Step five: integrate the above information to construct the objective function.

The objective function is as follows:

The constraints are as follows:

Step six: solve the optimization problem by an iterative optimization method.

The optimal solution is obtained by minimizing the above objective function subject to the constraints. To simplify the calculation, the value of m is set to 2. The sample memberships and the cluster centers are solved with the Lagrange multiplier method.
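To show how the Lagrange multiplier method yields closed-form updates when m = 2, the derivation below works through plain FCM. It deliberately omits the confidence weights and the graph regularization term of the safe objective, so it is a simplified illustration and not the update rule of the method itself.

```latex
\begin{aligned}
J &= \sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^{2}\,\lVert x_k - v_i\rVert^{2},
\qquad \text{s.t. } \sum_{i=1}^{c} u_{ik} = 1 \ \text{for every } k,\\
L &= J - \sum_{k=1}^{n} \lambda_k\Bigl(\sum_{i=1}^{c} u_{ik} - 1\Bigr),\\
\frac{\partial L}{\partial u_{ik}} &= 2\,u_{ik}\,\lVert x_k - v_i\rVert^{2} - \lambda_k = 0
\;\Rightarrow\; u_{ik} = \frac{\lambda_k}{2\,\lVert x_k - v_i\rVert^{2}},\\
\sum_{i=1}^{c} u_{ik} = 1 &\;\Rightarrow\;
u_{ik} = \frac{\lVert x_k - v_i\rVert^{-2}}{\sum_{j=1}^{c}\lVert x_k - v_j\rVert^{-2}},\\
\frac{\partial L}{\partial v_i} &= -2\sum_{k=1}^{n} u_{ik}^{2}\,(x_k - v_i) = 0
\;\Rightarrow\; v_i = \frac{\sum_{k=1}^{n} u_{ik}^{2}\,x_k}{\sum_{k=1}^{n} u_{ik}^{2}}.
\end{aligned}
```

Setting m = 2 makes the stationarity condition linear in $u_{ik}$, which is why the memberships can be solved in closed form.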

Membership $u_{ik}$ of an unlabeled sample:

where

Membership $u_{ik'}$ of a labeled sample:

where

Cluster center $v_i$:

The final membership matrix U and cluster centers V are obtained through iterative calculation. When

or when the maximum number of iterations is reached, the iteration terminates, where t is the current iteration number and η is a set threshold.
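The alternating update and the two stopping rules can be sketched in Python with plain FCM (m = 2). This is not the safe update of the method, whose formulas are not reproduced here. The convergence test on the largest change in membership between iterations is an assumed form of the threshold criterion, and the function name and initial centers are hypothetical.

```python
import math

def fcm(points, centers, eta=1e-5, max_iter=100):
    """Plain FCM with m = 2; returns memberships U[k][i], centers, iterations."""
    c, n, dim = len(centers), len(points), len(points[0])
    U_prev = None
    for t in range(1, max_iter + 1):
        # Membership update: u_ik is proportional to 1 / d_ik^2.
        U = []
        for x in points:
            inv = [1.0 / max(math.dist(x, v) ** 2, 1e-12) for v in centers]
            total = sum(inv)
            U.append([w / total for w in inv])
        # Center update: mean of the samples weighted by u_ik^2.
        centers = []
        for i in range(c):
            w = [U[k][i] ** 2 for k in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[k] * points[k][a] for k in range(n)) / total
                for a in range(dim)
            ))
        # Assumed stopping rule: largest membership change below eta.
        if U_prev is not None:
            change = max(abs(U[k][i] - U_prev[k][i])
                         for k in range(n) for i in range(c))
            if change < eta:
                break
        U_prev = U
    return U, centers, t

points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
U, centers, iterations = fcm(points, centers=[(0, 0), (10, 10)])
```

The loop ends either when the memberships stop changing by more than `eta` or when `max_iter` is reached, mirroring the two termination conditions described above.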

Step seven: determine the category of each unlabeled sample to realize medical image segmentation.

After the membership matrix U is obtained, it is defuzzified according to the maximum membership principle to obtain the category of each unlabeled sample, and the image is then segmented accordingly to obtain the result.
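Defuzzification by the maximum membership principle amounts to taking, for each pixel, the cluster with the largest membership. The Python sketch below assumes that U is stored with one row per pixel and that pixels were flattened in row-major order; both function names are hypothetical.

```python
def defuzzify(U):
    """U[k][i] is the membership of pixel k in cluster i; returns one label per pixel."""
    return [max(range(len(row)), key=row.__getitem__) for row in U]

def to_label_image(labels, height, width):
    """Reshape the flat label list into a height x width label map (row-major)."""
    return [labels[r * width:(r + 1) * width] for r in range(height)]

U = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = defuzzify(U)
segmentation = to_label_image(labels, height=2, width=2)
```

Each entry of `segmentation` is the cluster index of the corresponding pixel, which is the segmentation result.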

Claims

1. A medical image segmentation method based on safe semi-supervised clustering, characterized by comprising the following steps:

inputting a labeled sample subset of a medical image dataset, $X_l = [x_1, \dots, x_l]$, with corresponding labels $y_k \in \{1, \dots, c\}$, and an unlabeled sample subset, $X_u = [x_{l+1}, \dots, x_n]$;

obtaining predicted labels of the data set by FCM clustering:

mapping the predicted labels, using the Kuhn-Munkres algorithm, to mapped labels that are consistent in category with the given labels $y_k$;

step three: using the density peak clustering method, obtaining the confidence of each unlabeled sample from the local density of the unlabeled sample and its minimum distance to points of higher density, obtaining the confidence of each labeled sample from the local density of the labeled sample within the same labeled sample cluster and its minimum distance to points of higher density, and normalizing the confidences;

local density of an unlabeled sample:

where $j = 1, 2, \dots, n$, $k = l+1, \dots, n$, $\mathrm{dist}(k, j)$ is the Euclidean distance between points $x_k$ and $x_j$, and $d_c$ is the truncation distance;

minimum distance of an unlabeled sample from points with higher density:

and for the data point with the greatest density:

unlabeled sample confidence: $\gamma_k = \rho_k / \delta_k$ (4)

unlabeled sample confidence normalization:

local density of a labeled sample within the same labeled sample cluster:

where $j_y = 1, 2, \dots, q$, $k' = 1, 2, \dots, l$, and $j_y$ indexes the set of samples that have the same label as the labeled sample point $x_{k'}$;

and for the data point with the greatest density:

labeled sample confidence:

labeled sample confidence normalization:

constructing a local neighborhood graph of the labeled samples, the edge weights of the local graph, $W = [w_{k'r}]_{n \times n}$, being calculated as follows:

where $N_p(x_{k'})$ denotes the $p$ nearest neighbors of $x_{k'}$, $x_{k'}$ is a labeled sample point, $x_r$ is a neighboring sample point, and $\sigma$ is the width parameter of the Gaussian kernel function;

step five: integrating the above information to construct the objective function;

the objective function is as follows:

the constraints are as follows:

the optimal solution is obtained by minimizing the above objective function subject to the constraints; to simplify the calculation, the value of m is set to 2; the sample memberships and the cluster centers are solved with the Lagrange multiplier method;

membership $u_{ik}$ of an unlabeled sample:

where

membership $u_{ik'}$ of a labeled sample:

where

cluster center $v_i$:

the final membership matrix U and cluster centers V are obtained through iterative calculation; when

or when the maximum number of iterations is reached, the iteration terminates, where t is the current iteration number and η is a set threshold;

step seven: determining the category of each unlabeled sample to realize medical image segmentation;