CN113127661B

CN113127661B - Multi-supervision medical image retrieval method and system based on cyclic query expansion

Info

Publication number: CN113127661B
Application number: CN202110376391.7A
Authority: CN
Inventors: 石宝荣; 任菲; 谭光明; 刘新宇; 刘玉东
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2021-04-06
Filing date: 2021-04-06
Publication date: 2023-09-12
Anticipated expiration: 2041-04-06
Also published as: CN113127661A

Abstract

The invention relates to a multi-supervision medical image retrieval method based on cyclic query expansion, comprising the following steps: training a convolutional neural network on a known medical image data set to obtain a classification model; performing triplet mining on the known medical image data set with the classification model, and training the classification model with the mined triplets to obtain an image retrieval model; and, for a target medical image, obtaining a retrieval result from the known medical image data set through the image retrieval model. The invention also relates to a multi-supervision medical image retrieval system based on cyclic query expansion and to a data processing device. The method adopts NM (Neighbor Mining) triplet mining, which addresses the problem that using only label information or only similarity information is insufficient to meet the high-precision requirement of CBMIR, and provides the RQE query expansion method, which further improves medical image retrieval performance by making full use of the information in the retrieval results.

Description

Multi-supervision medical image retrieval method and system based on cyclic query expansion

Technical Field

The invention relates to the technical field of computers, in particular to a multi-supervision medical image retrieval method and system based on cyclic query expansion.

Background

In most cases, a doctor diagnosing a medical image first consults previously diagnosed medical images for reference. However, as the number of medical images grows rapidly, labeling and diagnosing them becomes increasingly laborious. To ease the burden on physicians, CBMIR (content-based medical image retrieval) is expected to be an effective technique for computer-aided diagnosis.

Content-based medical image retrieval is the task of finding the most similar images for a given medical image query. It has two main phases: first, extracting good features that describe the medical image; second, using the extracted features to search the database for the most similar images. The first phase determines whether CBMIR can fully understand the image, and the second determines whether the model can accurately retrieve similar images. Together, these two phases determine the performance of medical image retrieval.

In the first phase, in which CBMIR understands the image, existing methods mainly use image classification, such as "Medical Image Retrieval using Deep Convolutional Neural Network" by A. Qayyum et al., or metric learning, such as "Learning Deep Representations of Medical Images using Siamese CNNs with Application to Content-based Image Retrieval" by Y.-A. Chung et al. In the second phase, many existing query expansion methods expand the query results only once.

In recent years, deep learning methods have achieved great progress and remarkable results in computer vision and image processing, which has inspired many professionals to apply them to medical image analysis. Nevertheless, the accuracy of current medical image retrieval techniques is not yet satisfactory for clinical practice. The existing technology has three main shortcomings.

First, classification loss and similarity loss are two different losses, one favoring global features and the other favoring local features, and they should be combined.

Second, in current medical image retrieval technology, the method for constructing triplets is too simple and does not take full account of the characteristics of medical images. How training triplets are constructed with these characteristics in mind plays a key role in the accuracy of retrieval models driven by similarity loss.

Third, post-processing of the retrieval results is lacking. Post-processing is particularly critical in retrieval technology and tends to raise the accuracy of the results by at least 1 point.

In summary, there are currently few medical image retrieval techniques, and the prior art is not mature enough to meet the clinical requirements of hospitals.

Disclosure of Invention

In order to solve the problems, the invention provides a multi-supervision medical image retrieval method based on cyclic query expansion, which is characterized by comprising the following steps: training a convolutional neural network by using a known medical image data set to obtain a classification model; performing triplet mining on the known medical image dataset by using the classification model, and training the classification model by using the mined triplet to obtain an image retrieval model; for a target medical image, obtaining a retrieval result from the known medical image dataset by the image retrieval model.

In the multi-supervision medical image retrieval method of the invention, in the step of training the image retrieval model, a first triplet and a second triplet are selected from the training data set by a first selection strategy and a second selection strategy, respectively, and the image retrieval model is trained with the first triplet and the second triplet. The first selection strategy is: when the k-th training image retrieved from the training data set in a training period is the first negative sample retrieved in that training period, the (k-1)-th training image is taken as the positive sample, and the negative sample, the positive sample and the anchor sample form the first triplet. The second selection strategy is: when the l-th training image retrieved from the training data set in a training period is the first negative sample retrieved, the nearest positive sample after the negative sample is selected, and if the ranking distance from the positive sample to the negative sample is smaller than Q, the negative sample, the positive sample and the anchor sample form the second triplet. Q is the number of samples in the smallest class of the training data set, k and l are positive integers, and k > 1.

In the multi-supervision medical image retrieval method of the invention, the image retrieval model comprises a first branch network and a second branch network. The first branch network comprises a sum-pooling layer, an L2 regularization module and a Triplet Loss function L_t; the second branch network comprises a fully connected layer module and a cross-entropy loss function L_C.

In the definitions of L_t and L_C, m is the margin threshold between the negative-sample-to-anchor distance and the positive-sample-to-anchor distance in a triplet, q is the anchor sample feature, d^+ is the positive sample feature, d^- is the negative sample feature, M is the total number of samples in the training data set, i and j are class indices of the samples in the training data set, and y_c is the true label of each sample.

The cost function of the image retrieval model combines L_t and L_C, where λ is the balance factor between L_t and L_C, preferably λ = 0.1.

In the multi-supervision medical image retrieval method of the invention, the target medical image is retrieved by a cyclic rearrangement method comprising the following. In the 1st round of retrieval, retrieval is performed according to the image features of the target medical image, N 1st-round retrieval results are obtained from the N image databases of the known medical image data set, and a 1st-round similarity score matrix is constructed from the similarity scores of all 1st-round retrieval results relative to the target medical image. In the m-th round of retrieval, image features are extracted from the h + (m-2)Δh (m-1)-th-round retrieval results with the highest similarity scores and averaged to obtain the m-th-round retrieval feature; retrieval is performed with the m-th-round retrieval feature to obtain N m-th-round retrieval results; and the similarity scores of all m-th-round retrieval results relative to the m-th-round retrieval feature are accumulated onto the (m-1)-th-round similarity score matrix to obtain the m-th-round similarity score matrix. The M-th-round similarity score matrix is sorted by similarity score, and the M-th-round retrieval result corresponding to the maximum similarity score is taken as the final retrieval result. Here h, m, M, N and Δh are positive integers, m ∈ [2, M], and N = h + (M-2)Δh.

In the multi-supervision medical image retrieval method of the invention, before the step of training the image retrieval model, the images of the known medical image data set are augmented by the AutoAugment method, and the augmented images are then normalized.

The invention also provides a multi-supervision medical image retrieval system based on cyclic query expansion, which comprises: the model training module is used for training the convolutional neural network by using a known medical image data set to obtain a classification model; performing triplet mining on the known medical image data set by using the classification model, and training the classification model by using the mined triplet to obtain an image retrieval model; an image retrieval module for obtaining a retrieval result from the known medical image dataset for a target medical image by the image retrieval model.

The invention also proposes a computer readable storage medium storing computer executable instructions, characterized in that when the computer executable instructions are executed, a multi-supervised medical image retrieval method based on cyclic query expansion as described above is implemented.

The present invention provides a data processing apparatus comprising a computer readable storage medium as described above, which when accessed and executed by a processor of the data processing apparatus, performs a cyclic query expansion based multi-supervised medical image retrieval of a target medical image.

Drawings

FIG. 1 is a flow chart of a multi-supervised medical image retrieval method based on cyclic query expansion of the present invention.

Fig. 2 is a schematic diagram of a network model structure of the multi-supervised medical image retrieval method of the present invention.

FIG. 3 is a schematic diagram of the triplet training of the present invention.

Fig. 4 is a flow chart of a cyclic rearrangement method of the present invention.

Fig. 5 is a data processing apparatus of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.

First, to address the problem that the prior art does not combine classification loss and similarity loss, the invention proposes an image retrieval network that combines the two. The image retrieval network includes two branch networks, one receiving the classification loss and the other receiving the similarity loss. During training, the two losses are combined and back-propagated together.

Second, to address the problem that prior-art triplet construction is too simple and ignores the characteristics of medical images, the invention provides the NM (Neighbor Mining) triplet mining strategy. It comprises two modes, Strategy A and Strategy B, which construct triplets from different angles according to the large intra-class differences and small inter-class differences of medical images.

Third, to address the problem that the prior art either lacks post-processing of retrieval results or uses post-processing that is too simple, the invention provides a cyclic rearrangement method, RQE (recurrent query expansion). The method further improves medical image retrieval performance by making full use of the information in the ranking results.

Experimental results show that, compared with existing popular methods, the image retrieval model combining these three points greatly improves the retrieval performance of CBMIR (Content Based Medical Image Retrieval), so that the technology meets the requirements of clinical practice.

FIG. 1 is a flow chart of a multi-supervised medical image retrieval method based on cyclic query expansion of the present invention. As shown in fig. 1, the medical image retrieval method of the present invention includes the steps of:

Step S1: data preprocessing and first-stage training of the convolutional neural network, including acquisition of the data set, preprocessing of the images, and pre-training of the network.

Step S2: triplet mining and network training, including formulating the triplet mining strategies, constructing and explaining the corresponding network structure, defining the loss functions required for training, and setting the network parameters used during training.

Step S3: feature extraction and query expansion, including extracting features with the trained image retrieval model, performing query retrieval with the extracted features, and expanding the first query result to improve the final retrieval accuracy.

The present invention will be described in detail below with reference to fig. 2 by way of a specific embodiment for better explanation of the present invention.

Step S1, data preprocessing and network pre-training phases:

In recent years, diabetic retinopathy has been one of the leading causes of blindness worldwide. The data set used by the inventors in the experiments is the diabetic fundus data set from the Kaggle DR challenge. It contains 35135 fundus images in total, with 5 different labels (from normal/healthy to severe). In an embodiment of the invention, the data set is divided into a training data set and a validation data set in a ratio of 7:3.

First, a classification network with a convolutional neural network (CNN) backbone is built to train a simple classification model. For robustness, the AutoAugment method is applied for image augmentation, and each processed picture is scaled to 224x224 and then normalized. To address data imbalance, in each training period (epoch) a fixed number of preprocessed images are selected uniformly at random from each class, and the selected images form the training data set for that epoch. resnet50 is then used as the pre-training model and trained for 100 epochs.
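The per-epoch class balancing described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `sample_balanced_epoch`, the `per_class` parameter and the choice to sample with replacement when a class is smaller than `per_class` are assumptions, since the text does not specify them.

```python
import random
from collections import defaultdict

def sample_balanced_epoch(labels, per_class, seed=None):
    """Return image indices with `per_class` images drawn uniformly from every class.

    `labels` holds one class id per image. Classes with fewer than `per_class`
    images are sampled with replacement (an assumption, not stated in the text).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        if len(pool) >= per_class:
            chosen.extend(rng.sample(pool, per_class))      # without replacement
        else:
            chosen.extend(rng.choices(pool, k=per_class))   # with replacement
    rng.shuffle(chosen)
    return chosen
```

Calling the function once per epoch with a different seed yields a fresh, equally sized draw from each of the 5 labels.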

Step S2, triplet mining and network training phases:

Existing medical image retrieval methods train the network with either category information alone or similarity information alone; combining the two is uncommon. However, the losses associated with these two types of information are different and complementary. The invention therefore proposes the network of fig. 2, which exploits the two complementary losses. The CNN backbone is the resnet50 model pre-trained in the previous stage, with the average pooling layer and the fully connected layer removed. As shown in fig. 2, the network includes two branch networks after the backbone. The first branch network comprises a sum-pooling module and an L2 regularization step, followed by a Triplet Loss function. The second branch network comprises a fully connected layer, followed by a cross-entropy loss function.
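As an illustration of the first branch, the sketch below turns a backbone feature map into a retrieval descriptor. It assumes that the pooling module sums each channel over its spatial positions and that the L2 step rescales the pooled vector to unit length; both readings, and all function names, are assumptions rather than details confirmed by the text.

```python
import math

def sum_pool(feature_map):
    """Collapse a C x H x W feature map to a C-dim vector by summing each channel."""
    return [sum(sum(row) for row in channel) for channel in feature_map]

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]

def retrieval_descriptor(feature_map):
    """First-branch output: sum pooling followed by L2 normalization."""
    return l2_normalize(sum_pool(feature_map))

# Toy 2-channel, 2x2 feature map standing in for a backbone output.
toy_map = [
    [[1.0, 2.0], [0.0, 0.0]],   # channel 0 sums to 3
    [[1.0, 1.0], [1.0, 1.0]],   # channel 1 sums to 4
]
descriptor = retrieval_descriptor(toy_map)
```

With unit-length descriptors, the dot product of two descriptors equals their cosine similarity, which keeps scores comparable across images.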

In the Triplet Loss function L_t, m is a margin variable that represents the threshold on the difference between the negative-sample-to-anchor distance and the positive-sample-to-anchor distance in a triplet, q is the feature of the triplet's anchor sample, d^+ is the feature of the triplet's positive sample, and d^- is the feature of the triplet's negative sample. The Triplet Loss drives the learning process as shown in fig. 3: the distance from the positive sample P to the anchor sample A gradually becomes smaller than the distance from the negative sample N to the anchor sample, and the difference between the two distances must exceed the margin value.
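The expression for the Triplet Loss is not present in this text. A commonly used form that is consistent with the symbols m, q, d^+ and d^- defined above is the hinge over squared Euclidean distances; it is given here as an assumption and may differ from the expression in the original filing (for example by a constant factor):

```latex
L_t = \max\left(0,\; m + \lVert q - d^{+} \rVert_2^2 - \lVert q - d^{-} \rVert_2^2\right)
```

Under this form the loss is zero once the negative sample is farther from the anchor than the positive sample by at least m, which matches the behavior described for fig. 3.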

In the cross-entropy loss L_C used in the network, M represents the total number of samples, y_c represents the true label of each sample of a triplet, i denotes the i-th class label in the training data set, and j denotes the j-th class label in the training data set.
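The cross-entropy expression is likewise not present in this text. One standard softmax form that uses the symbols M, y_c, i and j as defined above is sketched below. The logit symbol z and the sample index n are hypothetical names introduced for illustration, and the exact expression in the original filing may differ:

```latex
L_C = -\frac{1}{M} \sum_{n=1}^{M} \log \frac{\exp\left(z^{(n)}_{i}\right)}{\sum_{j} \exp\left(z^{(n)}_{j}\right)}, \qquad i = y_c^{(n)}
```

Here z^{(n)}_j stands for the output of the fully connected layer for class j on sample n, so each term is the negative log-probability assigned to the sample's true class.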

In the total cost function L_total, λ is the balance factor between the two losses; repeated experiments show that λ is preferably 0.1.
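To make the interplay of the two losses concrete, the sketch below evaluates a hinge-style triplet loss, a softmax cross-entropy and their weighted sum on toy values. The functional forms are standard choices rather than the filing's own expressions, and the placement of the balance factor (`lam` scaling the triplet term) is an assumption: the text only states that λ balances the two losses and is preferably 0.1.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(q, d_pos, d_neg, m):
    """Hinge loss: zero once d_neg is farther from q than d_pos by at least m."""
    return max(0.0, m + squared_distance(q, d_pos) - squared_distance(q, d_neg))

def cross_entropy(logits, true_class):
    """Softmax cross-entropy for one sample, using the log-sum-exp trick."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[true_class]

def total_loss(l_t, l_c, lam=0.1):
    """Weighted sum of both losses.

    Assumption: lam scales the triplet term. Scaling the cross-entropy
    term instead would be equally compatible with the text.
    """
    return l_c + lam * l_t

# Toy values: the positive is farther from the anchor than the negative,
# so the triplet term is active.
l_t = triplet_loss([0.0, 0.0], [2.0, 0.0], [1.0, 0.0], m=0.5)
l_c = cross_entropy([0.0, 0.0], true_class=0)
loss = total_loss(l_t, l_c)
```

Because both terms depend on the same backbone features, a single backward pass through `loss` would update the shared layers with gradients from both branches.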

Before introducing the NM (Neighbor Mining) triplet mining strategy of the invention, the characteristics of triplet mining and of medical images are analyzed. In triplet mining, neither triplets that are too easy nor triplets that are too hard should be selected: easy triplets contribute little to training and lead to slow convergence, while overly hard triplets can drive the model into a bad local minimum too early and cause model collapse. To achieve excellent performance in medical image retrieval, the triplet mining strategy must take the characteristics of medical images into account. Compared with natural images, most medical images, such as diabetic fundus images, tend to have larger distances between samples of the same class and smaller distances between samples of different classes, and many of the subtle differences between images with different labels are hard to identify with the naked eye. To address these problems, the invention proposes two strategies:

Strategy A: in any training period, the first sample in the retrieval result whose category differs from that of the query sample is selected as the negative sample, and its index is recorded as k. If this sample is not ranked first (k ≠ 1), the (k-1)-th sample is taken as the positive sample, and the query sample (as the anchor sample), the negative sample and the positive sample form a semi-hard triplet.

Strategy B: in any training period, the first sample in the retrieval result whose category differs from that of the query sample is selected as the negative sample, and the first positive sample after it is selected. If the ranking distance between the positive sample and the negative sample is smaller than the search range Q (Q is the number of samples in the smallest class of the training data set), the query sample, the negative sample and the positive sample form a triplet.

The query sample is a sample from the validation data set; when, during training of the image retrieval model, other pictures similar to a given picture need to be retrieved, that picture is called the query sample.

Note that Strategy A selects two samples adjacent in the ranking as elements of the triplet, and Strategy B selects two samples that are no farther than Q apart in the ranking. In both strategies the positive and negative samples are relatively close in the query ranking, so training emphasizes small differences. In this way the model changes gradually to correct misclassified instances, which suits the fact that differences between classes of some medical images are hard to distinguish with the naked eye. In each epoch, triplet mining is performed on the training data set with Strategy A and Strategy B simultaneously, and the resulting triplets are used for network training in that epoch to obtain the image retrieval model.
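The two selection strategies can be sketched as follows for a single query. The sketch assumes the retrieval result is given as two parallel lists, `ranked_ids` and `ranked_labels`, ordered from most to least similar and excluding the query itself; these names, the 0-based positions and the `q_range` parameter (standing for Q) are illustrative choices.

```python
def first_negative(ranked_labels, query_label):
    """0-based rank of the first result whose class differs from the query, or None."""
    for position, label in enumerate(ranked_labels):
        if label != query_label:
            return position
    return None

def strategy_a(ranked_ids, ranked_labels, query_label):
    """Pair the first negative with the result ranked immediately before it.

    Returns (positive_id, negative_id), or None when the first negative is
    ranked first (k = 1) or no negative exists.
    """
    neg = first_negative(ranked_labels, query_label)
    if neg is None or neg == 0:
        return None
    # Every result before the first negative is a positive by construction.
    return ranked_ids[neg - 1], ranked_ids[neg]

def strategy_b(ranked_ids, ranked_labels, query_label, q_range):
    """Pair the first negative with the first positive ranked after it.

    The pair is kept only if the two ranks differ by less than q_range.
    """
    neg = first_negative(ranked_labels, query_label)
    if neg is None:
        return None
    for position in range(neg + 1, len(ranked_labels)):
        if ranked_labels[position] == query_label:
            if position - neg < q_range:
                return ranked_ids[position], ranked_ids[neg]
            return None
    return None

# Toy ranking for a query of class 1.
ids = [10, 11, 12, 13, 14]
labels = [1, 1, 0, 2, 1]
pair_a = strategy_a(ids, labels, query_label=1)              # (11, 12)
pair_b = strategy_b(ids, labels, query_label=1, q_range=3)   # (14, 12)
```

Each returned pair, together with the query as anchor, forms one training triplet; running both functions over every query in an epoch gives the mined set for that epoch.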

Step S3, a feature extraction and query expansion stage:

When a target image is retrieved through the image retrieval model, the conv4 output of the resnet50 model is extracted for the target image, and the result after sum pooling and L2 regularization is used as the image feature of the target image. The similarity between images is measured by the dot product of their features. To improve retrieval accuracy, the invention provides the RQE cyclic rearrangement method, shown in fig. 4, which proceeds as follows:
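Before the cyclic rearrangement is applied, a single retrieval round amounts to scoring every database feature against the query feature by dot product and sorting. The sketch below illustrates this first round; the function names and the toy features are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_similarity(query_feature, database_features):
    """Return (order, scores): database indices sorted by descending dot product."""
    scores = [dot(query_feature, feature) for feature in database_features]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# Toy unit-length features for three database images.
database = [[0.0, 1.0], [1.0, 0.0], [0.6, 0.8]]
order, scores = rank_by_similarity([1.0, 0.0], database)
```

The `scores` list plays the role of one row of the similarity score matrix for this query.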

Given a list Q, each element of Q represents a cyclic rearrangement range, and the ranges increase in sequence. The initial similarity score matrix is defined as the similarity score matrix obtained by the first query. The whole list Q is then traversed: in the i-th traversal, the features of the top Q[i] results of the previous ranking are averaged to form a new query, and the similarity score matrix computed from that query's results is added to the initial similarity score matrix. After the traversal, the final similarity score matrix is sorted to obtain the final retrieval result.

Fig. 4 is a flow chart of a cyclic rearrangement method of the present invention. As shown in fig. 4, in an embodiment of the present invention, the method specifically includes:

Step S31: perform the 1st round of retrieval according to the image features of the target medical image, obtaining N 1st-round retrieval results from the N image databases of the known medical image data set;

Step S32: construct a 1st-round similarity score matrix from the similarity scores of all 1st-round retrieval results relative to the target medical image;

Step S33: perform the m-th round of retrieval: extract image features from the h + (m-2)Δh (m-1)-th-round retrieval results with the highest similarity scores and average them to obtain the m-th-round retrieval feature;

Step S34: retrieve according to the m-th-round retrieval feature to obtain N m-th-round retrieval results;

Step S35: accumulate the similarity scores of all m-th-round retrieval results relative to the m-th-round retrieval feature onto the (m-1)-th-round similarity score matrix to obtain the m-th-round similarity score matrix;

Step S36: judge whether there are (m-1)-th-round retrieval results not yet included in the rearrangement range; if so, go to step S37; if image features have been extracted from all (m-1)-th-round retrieval results, go to step S38;

Step S37: enlarge the range over the previous round's retrieval results according to the rule h + (m-2)Δh, and return to step S33;

Step S38: sort the M-th-round (last-round) similarity score matrix by similarity score, and take the M-th-round retrieval result corresponding to the maximum similarity score as the final retrieval result;

wherein h, m, M, N and Δh are positive integers, m ∈ [2, M], and N = h + (M-2)Δh.
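The rearrangement range used in each round follows directly from the rule h + (m-2)Δh. The helper below, whose name and argument names are illustrative, lists the range for rounds 2 through M so that the relation N = h + (M-2)Δh can be checked on concrete numbers.

```python
def rearrangement_ranges(h, delta_h, rounds):
    """Top-k sizes used by rounds m = 2..rounds, following h + (m - 2) * delta_h."""
    return [h + (m - 2) * delta_h for m in range(2, rounds + 1)]

# With h = 2, delta_h = 3 and M = 4 rounds, the ranges grow as 2, 5, 8,
# and the last range equals N = h + (M - 2) * delta_h = 8.
ranges = rearrangement_ranges(h=2, delta_h=3, rounds=4)
```

The resulting list corresponds to the increasing list of rearrangement ranges that the method traverses.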

The pseudo code of the method is as follows:

The Rank array generated by the algorithm serves as the final query result.
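The pseudo code itself is not present in this text. The following Python sketch shows one way the described procedure could be written for a single query. It is an illustration under stated assumptions, not the original pseudo code: the top results of each round are taken from the ranking of the accumulated scores, the averaged feature is used as-is without renormalization, and all names (`recurrent_query_expansion`, `ranges`, `rank`) are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))

def argsort_descending(values):
    """Indices of `values` sorted from largest to smallest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)

def recurrent_query_expansion(query_feature, database_features, ranges):
    """Return (rank, total): final ordering and accumulated similarity scores.

    `ranges` is the increasing list of rearrangement ranges, one per extra round.
    """
    # Round 1: score every database feature against the original query.
    total = [dot(query_feature, feature) for feature in database_features]
    rank = argsort_descending(total)
    dim = len(query_feature)

    for top_k in ranges:
        top = rank[:top_k]
        # Average the features of the current top results to form a new query.
        mean = [sum(database_features[i][d] for i in top) / len(top)
                for d in range(dim)]
        # Score the new query and accumulate onto the running scores.
        round_scores = [dot(mean, feature) for feature in database_features]
        total = [t + s for t, s in zip(total, round_scores)]
        rank = argsort_descending(total)
    return rank, total

# Toy database of unit-length features and a query aligned with the first one.
database = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]
rank, total = recurrent_query_expansion([1.0, 0.0], database, ranges=[2])
```

Accumulating scores across rounds, rather than replacing them, means that an image must stay similar to both the original query and the expanded queries to keep a high final rank.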

The above discloses the multi-supervision medical image retrieval method of the invention in a specific embodiment that uses the diabetic fundus data set as the original medical image data set. The method can also be applied to query retrieval of other medical images, such as tumor medical images, and the invention is not limited in this respect.

The invention also proposes a computer readable storage medium, and a data processing apparatus, as shown in fig. 5. The computer readable storage medium of the present invention stores computer executable instructions that, when executed by a processor of a data processing apparatus, implement the above-described multi-supervised medical image retrieval method based on cyclic query expansion. Those of ordinary skill in the art will appreciate that all or a portion of the steps of the above-described methods may be performed by a program that instructs associated hardware (e.g., processor, FPGA, ASIC, etc.), which may be stored on a readable storage medium such as read only memory, magnetic or optical disk, etc. All or part of the steps of the embodiments described above may also be implemented using one or more integrated circuits. Accordingly, each module in the above embodiments may be implemented in the form of hardware, for example, by an integrated circuit, or may be implemented in the form of a software functional module, for example, by a processor executing a program/instruction stored in a memory to implement its corresponding function. Embodiments of the invention are not limited to any specific form of combination of hardware and software.

Compared with the prior art, the invention has the following beneficial effects:

(1) A new triplet mining method, NM (Neighbor Mining), is provided for the characteristics of medical images. Triplet loss and cross-entropy loss are combined for the first time in the field of medical image retrieval, which addresses the problem that using only label information or only similarity information is insufficient to meet the high-precision requirement of CBMIR (Content Based Medical Image Retrieval);

(2) A new query expansion method, RQE, is provided, which further improves the performance of medical image retrieval and fully utilizes the information in the retrieval result.

(3) Compared with other popular methods, the method of the invention achieves a competitive result on the data set, and the model can better assist doctors in medical diagnosis work.

Although the present invention has been described with reference to the above embodiments, it should be understood that the invention is not limited thereto, but may be modified and changed by those skilled in the art without departing from the spirit and scope of the present invention.

Claims

1. A multi-supervised medical image retrieval method based on cyclic query expansion, comprising:

training a convolutional neural network by using a known medical image data set to obtain a classification model; performing triplet mining on the known medical image data set by using the classification model, and taking the mined triplets as a training data set; selecting a first triplet and a second triplet from the training data set by a first selection strategy and a second selection strategy respectively, and training the classification model with the first triplet and the second triplet to obtain an image retrieval model; wherein the first selection strategy is: when the k-th training image retrieved from the training data set in a training period is the first negative sample retrieved in that training period, the (k-1)-th training image is taken as a positive sample, and the negative sample, the positive sample and the anchor sample form the first triplet; the second selection strategy is: when the l-th training image retrieved from the training data set in a training period is the first negative sample retrieved, the nearest positive sample after the negative sample is selected, and if the ranking distance from the positive sample to the negative sample is smaller than Q, the negative sample, the positive sample and the anchor sample form the second triplet; Q is the number of samples in the smallest class of the training data set, k and l are positive integers, and k > 1;

for a target medical image, obtaining a retrieval result from the known medical image dataset by the image retrieval model.

2. The multi-supervised medical image retrieval method as recited in claim 1, wherein the image retrieval model includes a first branch network and a second branch network, the first branch network including a sum-pooling layer, an L2 regularization module, and a Triplet Loss function L_t, and the second branch network including a fully connected layer module and a cross-entropy loss function L_C.

3. The multi-supervised medical image retrieval method as recited in claim 2, wherein the cost function of the image retrieval model is:

where λ is the balance factor of L_t and L_C.

4. A multi-supervised medical image retrieval method as claimed in claim 3, wherein λ = 0.1.

5. The multi-supervised medical image retrieval method as recited in claim 1, wherein the target medical image is retrieved by a cyclic rearrangement method comprising:

in the 1st round of retrieval, performing retrieval according to the image features of the target medical image, obtaining N 1st-round retrieval results from the N image databases of the known medical image data set, and constructing a 1st-round similarity score matrix from the similarity scores of all 1st-round retrieval results relative to the target medical image;

in the m-th round of retrieval, extracting image features from the h + (m-2)Δh (m-1)-th-round retrieval results with the highest similarity scores and averaging them to obtain the m-th-round retrieval feature, retrieving according to the m-th-round retrieval feature to obtain N m-th-round retrieval results, and accumulating the similarity scores of all m-th-round retrieval results relative to the m-th-round retrieval feature onto the (m-1)-th-round similarity score matrix to obtain the m-th-round similarity score matrix;

sorting the M-th-round similarity score matrix by similarity score, and taking the M-th-round retrieval result corresponding to the maximum similarity score as the final retrieval result;

6. The multi-supervised medical image retrieval method as recited in claim 1, wherein, prior to the step of training the image retrieval model, the images of the known medical image data set are augmented by the AutoAugment method and the augmented images are normalized.

7. A multi-supervised medical image retrieval system based on cyclic query expansion, comprising:

a model training module for training a convolutional neural network by using a known medical image data set to obtain a classification model; performing triplet mining on the known medical image data set by using the classification model, and taking the mined triplets as a training data set; selecting a first triplet and a second triplet from the training data set by a first selection strategy and a second selection strategy respectively, and training the classification model with the first triplet and the second triplet to obtain an image retrieval model; wherein the first selection strategy is: when the k-th training image retrieved from the training data set in a training period is the first negative sample retrieved in that training period, the (k-1)-th training image is taken as a positive sample, and the negative sample, the positive sample and the anchor sample form the first triplet; the second selection strategy is: when the l-th training image retrieved from the training data set in a training period is the first negative sample retrieved, the nearest positive sample after the negative sample is selected, and if the ranking distance from the positive sample to the negative sample is smaller than Q, the negative sample, the positive sample and the anchor sample form the second triplet; Q is the number of samples in the smallest class of the training data set, k and l are positive integers, and k > 1;

an image retrieval module for obtaining a retrieval result from the known medical image dataset for a target medical image by the image retrieval model.

8. A computer readable storage medium storing computer executable instructions which, when executed, implement the cyclic query expansion based multi-supervised medical image retrieval method of any of claims 1-6.

9. A data processing apparatus comprising the computer readable storage medium of claim 8, wherein, when the computer executable instructions in the computer readable storage medium are accessed and executed by a processor of the data processing apparatus, multi-supervised medical image retrieval based on cyclic query expansion is performed on a target medical image.