CN112700434B - Medical image classification method and classification device thereof - Google Patents

Medical image classification method and classification device thereof

Info

Publication number
CN112700434B
Authority
CN
China
Prior art keywords
medical image
image classification
model
attention
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110038447.8A
Other languages
Chinese (zh)
Other versions
CN112700434A (en)
Inventor
丁赛赛
左文琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Yida Health Management Co ltd
Original Assignee
Guangzhou Yida Health Management Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Yida Health Management Co ltd filed Critical Guangzhou Yida Health Management Co ltd
Priority to CN202110038447.8A priority Critical patent/CN112700434B/en
Publication of CN112700434A publication Critical patent/CN112700434A/en
Application granted granted Critical
Publication of CN112700434B publication Critical patent/CN112700434B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The application discloses a medical image classification method and a classification device thereof, relating to the technical field of digital image processing. The method comprises the following steps: using transfer learning to migrate a deep learning model trained on a natural image data set to the medical image domain to obtain a basic model of medical image classification; constructing a deep attention branch network based on class activation mapping on the basic model of medical image classification; constructing a loss weighting module based on a gray level co-occurrence matrix on the basic model of medical image classification; fusing the loss weighting module based on the gray level co-occurrence matrix with the deep attention branch network based on class activation mapping to obtain a fused medical image classification model; training the fused medical image classification model to obtain an automatic medical image classification model; and automatically classifying the images to be classified through the automatic medical image classification model. The application can be used for the accurate classification of medical images.

Description

Medical image classification method and classification device thereof
Technical Field
The invention relates to the technical field of digital image processing, in particular to a medical image classification method and a medical image classification device.
Background
Classification of medical images is of great importance for the clinical diagnosis and assessment of diseases. Traditional methods based on manually extracted features involve a complex process, generalize poorly, and yield poor classification results when normal and lesion areas are highly similar. Moreover, since a lesion area typically occupies only a small portion of the image, deep convolutional neural networks lack the ability to focus on meaningful lesions. In addition, medical image datasets often suffer from severe class imbalance, making it difficult to obtain classification models with good sensitivity. Accurate classification of medical images therefore remains a great challenge.
Disclosure of Invention
In order to overcome the defects in the prior art, the technical problem to be solved by the embodiments of the invention is to provide a medical image classification method and a medical image classification device that can accurately classify medical images and effectively address the problems of small-target recognition and extreme class imbalance commonly found in medical image classification tasks.
The specific technical scheme of the embodiment of the invention is as follows:
A method of classifying a medical image, the method of classifying a medical image comprising:
using transfer learning to migrate the deep learning model trained on the natural image data set to the medical image domain to obtain a basic model of medical image classification;
constructing a deep attention branch network based on class activation mapping on the basic model of medical image classification;
constructing a loss weighting module based on a gray level co-occurrence matrix on the basic model of medical image classification;
fusing the loss weighting module based on the gray level co-occurrence matrix with the deep attention branch network based on class activation mapping on the basic model of medical image classification to obtain a fused medical image classification model;
training the fused medical image classification model to obtain an automatic medical image classification model;
and automatically classifying the images to be classified through the automatic medical image classification model.
Preferably, the step of migrating the deep learning model trained on the natural image data set to the medical image domain using transfer learning to obtain the basic model of medical image classification further includes: adding a fully connected layer at the end of the trained deep learning model, wherein the number of neurons of the fully connected layer matches the number of classification categories, so that the basic model of medical image classification is adaptively obtained.
Preferably, the step of constructing a class activation mapping based deep attention branch network on the basic model of medical image classification comprises:
the class activation map has a convolution layer, a global average pool and a fully connected layer as its last three layers, and high-response positions in the class activation map represent lesion areas; after the global average pool, the average value of each feature map of the last convolution layer is obtained, and the averages are then weighted and summed with the fully connected layer weights to obtain the class activation map:

CAM_c(x, y) = Σ_k w_k^c · f_k(x, y)

where f_k(x, y) is the value of the k-th feature map at position (x, y) in the last convolution layer, k indexes the feature maps, w_k^c is the fully connected layer weight for category c, and CAM_c(x, y) is the class activation map for category c.
Preferably, the step of constructing the class activation mapping based deep attention branch network on the basic model of medical image classification further comprises:
replacing the fully connected layer with a K×1×1 convolution layer in the deep attention branch network, wherein K is the number of categories and K×1×1 means the convolution kernel is 1×1 with K channels;
the attention branch generates the attention map from the K category confidence maps, with the average of each feature map on the last convolution layer, obtained after the global average pool, serving as the confidence score for each category.
Preferably, the step of constructing the class activation mapping based deep attention branch network on the basic model of medical image classification further comprises:
the output of the attention mechanism is derived from the attention map and the feature map output by the feature extractor, as follows:

g′(X_i) = (1 + M(X_i)) · g(X_i)

where g(X_i) is the feature map output by the feature extractor, M(X_i) is the attention map, and g′(X_i) is the output of the attention mechanism.
Preferably, the step of constructing a gray level co-occurrence matrix based loss weighting module on the basic model of medical image classification specifically comprises:
calculating the entropy of the image from the gray level co-occurrence matrix, wherein the entropy reflects the disorder of the image:

Entropy = -Σ_{i,j} P(i, j) · log P(i, j)

where P denotes the gray level co-occurrence matrix, each entry P(i, j) corresponds to the number of co-occurrences of the gray level pair (i, j), and Entropy is the entropy of the image;
based on the average entropy of malignant lesions, the loss weight of benign lesions whose entropy exceeds this average is increased.
Preferably, in the step of fusing the gray level co-occurrence matrix based loss weighting module with the class activation mapping based deep attention branch network on the basic model of medical image classification to obtain the fused medical image classification model,
the deep attention branch network is trained in an end-to-end manner using a loss weighting strategy directed by the gray level co-occurrence matrix, with the total loss function:

L_total(X_i) = β · L_att(X_i) + L_pre(X_i)

where L_att(X_i) is the loss of the attention branch for input sample X_i, L_pre(X_i) is the loss of the prediction branch, L_total(X_i) is a simple weighted sum of the two losses, and β controls the attention branch loss weight, with β = 0.5;
the loss L_bra of each branch is calculated by combining the gray level co-occurrence matrix directed loss weighting with the binary cross entropy loss:

L_bra(X_i) = -ω_i · [y_i · log p_i + (1 - y_i) · log(1 - p_i)]

where y_i is the label, p_i is the branch prediction output, and ω_i is the gray level co-occurrence matrix directed weight.
Preferably, in the step of training the fused medical image classification model to obtain the automatic medical image classification model, the input image is randomly cropped to a preset fixed size, the data are enhanced by random horizontal and vertical flipping, and the network parameters in the fused medical image classification model are then optimized using stochastic gradient descent.
A medical image classification apparatus comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, performs the steps of the medical image classification method described above.
The technical scheme of the invention has the following remarkable beneficial effects:
The application relates to a medical image classification method and a classification device thereof, which combine gray level co-occurrence matrix based loss weighting with a class activation mapping based deep attention branch network. The application can therefore achieve accurate classification of medical images and effectively address the problems of small-target recognition and extreme class imbalance commonly found in medical image classification tasks.
Specific embodiments of the invention are disclosed in detail below with reference to the following description and drawings, indicating the manner in which the principles of the invention may be employed. It should be understood that the embodiments of the invention are not limited in scope thereby. The embodiments of the invention include many variations, modifications and equivalents within the spirit and scope of the appended claims. Features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments in combination with or instead of the features of the other embodiments.
Drawings
The drawings described herein are for illustration purposes only and are not intended to limit the scope of the present disclosure in any way. In addition, the shapes, proportional sizes, and the like of the respective components in the drawings are merely illustrative for aiding in understanding the present invention, and are not particularly limited. Those skilled in the art with access to the teachings of the present invention can select a variety of possible shapes and scale sizes to practice the present invention as the case may be.
FIG. 1 is a flowchart of a method for classifying medical images according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a structure for constructing a class activation mapping-based deep attention branching network on a basic model of medical image classification in an embodiment of the present invention;
fig. 3 is a schematic block diagram of a method for classifying medical images according to an embodiment of the present invention.
Detailed Description
The details of the invention will be more clearly understood in conjunction with the accompanying drawings and the description of specific embodiments of the invention. The specific embodiments described herein are for purposes of illustration only and are not to be construed as limiting the invention in any way. Given the teachings of the present invention, one of ordinary skill in the related art will contemplate possible modifications based on the present invention, and such modifications should be considered to be within the scope of the present invention. It will be understood that when an element is referred to as being "disposed on" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "mounted," "connected," and "coupled" are to be construed broadly: a connection may be mechanical or electrical, the two elements may communicate with each other, and a connection may be direct or indirect through an intermediary; the specific meaning of these terms may be understood by those of ordinary skill in the art in view of the specific circumstances. The terms "vertical," "horizontal," "upper," "lower," "left," "right," and the like are used herein for illustrative purposes only and do not denote the only possible embodiment.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
In order to accurately classify medical images and effectively address the problems of small-target recognition and extreme class imbalance commonly found in medical image classification tasks, the application provides a medical image classification method. FIG. 1 is a flowchart of a medical image classification method in an embodiment of the application; as shown in FIG. 1, the method may comprise the following steps:
s101: and migrating the deep learning model trained in the natural image data set to the field of medical images by using migration learning to obtain a basic model of medical image classification.
For medical images, whose data scale is relatively small, transfer learning speeds up model convergence and yields better classification results. Considering that most data and tasks are related, the model parameters (i.e., the learned knowledge) of a deep learning model already trained on a natural image dataset (ImageNet) are transferred by transfer learning to a new model in the medical image domain, directly yielding a basic model of medical image classification. Sharing these pretrained parameters with the new model speeds up and optimizes its learning, instead of training the basic model of medical image classification from scratch as most networks do.
In order to match the model output to the classification task, the invention adds a fully connected (FC) layer directly at the end of the trained deep learning model, with the number of neurons matching the number of classification categories, so that the basic model of medical image classification is adaptively obtained.
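As a minimal sketch, this base model could be built in PyTorch as follows (the framework, the ResNet-50 backbone and the binary num_classes are illustrative assumptions; the patent only specifies an ImageNet-pretrained model with a new FC layer):

    import torch.nn as nn
    from torchvision import models

    num_classes = 2  # hypothetical: e.g. benign vs. malignant lesions

    # Deep learning model pretrained on the natural image dataset (ImageNet);
    # requires torchvision >= 0.13 for the weights argument.
    base_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

    # Replace the final fully connected layer so that the number of
    # neurons matches the number of classification categories.
    base_model.fc = nn.Linear(base_model.fc.in_features, num_classes)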
S102: a Class Activation Map (CAM) based deep attention branching network (Deep Attention Branch Network, DABN) is constructed on a basic model of medical image classification.
To improve the network's ability to focus on meaningful lesion locations in the basic model of medical image classification, the application utilizes CAM-based attention branches to help the network locate lesion areas. CAM is a representative response-based visual interpretation that uses the responses of the convolution layers and the weights of the last fully connected (FC) layer to visualize the attention map of each class. The CAM has a convolution layer, a Global Average Pool (GAP) and a fully connected (FC) layer as its last three layers, and the high-response positions in the CAM represent lesion areas, as shown in fig. 2. After GAP, the average value of each feature map of the last convolution layer is obtained; these averages are then weighted and summed with the FC layer weights to obtain the CAM:

CAM_c(x, y) = Σ_k w_k^c · f_k(x, y)

where f_k(x, y) is the value of the k-th feature map at position (x, y) in the last convolution layer, k indexes the feature maps, and w_k^c is the FC layer weight for category c.
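Read off this formula, the class activation map is simply a weighted sum of the last-layer feature maps; a short sketch (the tensor shapes are assumptions):

    import torch

    def class_activation_map(feats: torch.Tensor, fc_weight: torch.Tensor, c: int) -> torch.Tensor:
        # feats: (K, H, W) feature maps f_k from the last convolution layer.
        # fc_weight: (num_classes, K) FC layer weights, so fc_weight[c] holds w_k^c.
        # Returns CAM_c(x, y) = sum_k w_k^c * f_k(x, y), an (H, W) map.
        return torch.einsum("k,khw->hw", fc_weight[c], feats)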
Since the CAM needs the weights of the trained final FC layer to weight and sum the feature maps, in order to obtain an attention map during training the FC layer is replaced by a K×1×1 convolution layer in the DABN, where K is the number of categories and "K×1×1" means the convolution kernel is 1×1 with K channels. As shown in fig. 3, the first K×1×1 convolution layer outputs K feature maps representing the locations of interest for each category. The second K×1×1 convolution layer mimics the last FC layer of the CAM in forward propagation; after GAP, the average of each feature map on the last convolution layer is obtained, which represents the confidence score for each category. The confidence scores are then used in two ways: they are multiplied with the corresponding feature maps to obtain the category confidence maps, and they are passed through a softmax function to output the category probabilities. Finally, the attention branch generates the attention map from the K category confidence maps. To aggregate the K category confidence maps, the invention uses a 1×7×7 convolution layer; through this 7×7 convolution layer with a Sigmoid function, the attention map for the attention mechanism is generated.
In order to highlight the feature map at the peaks of the attention map while preventing it from degrading to zero in low-value regions of the attention map, the application uses the following formula:

g′(X_i) = (1 + M(X_i)) · g(X_i)

where g(X_i) is the feature map output by the feature extractor, M(X_i) is the attention map, and g′(X_i) is the output of the attention mechanism.
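Putting this section together, a hedged PyTorch sketch of the attention branch and attention mechanism could look as follows (channel counts and the exact ordering of operations are assumptions inferred from the description above; this is not the patent's reference implementation):

    import torch
    import torch.nn as nn

    class AttentionBranch(nn.Module):
        # CAM-based attention branch plus the residual attention mechanism.
        def __init__(self, in_channels: int, num_classes: int):
            super().__init__()
            # First K x 1 x 1 convolution: K per-class location maps.
            self.conv_class = nn.Conv2d(in_channels, num_classes, kernel_size=1)
            # Second K x 1 x 1 convolution mimicking the CAM's FC layer.
            self.conv_score = nn.Conv2d(num_classes, num_classes, kernel_size=1)
            self.gap = nn.AdaptiveAvgPool2d(1)
            # 7 x 7 convolution (with Sigmoid below) aggregating the K
            # category confidence maps into a single attention map.
            self.conv_att = nn.Conv2d(num_classes, 1, kernel_size=7, padding=3)

        def forward(self, g):
            class_maps = self.conv_class(g)                     # (B, K, H, W)
            scores = self.gap(self.conv_score(class_maps))      # (B, K, 1, 1) confidence scores
            probs = torch.softmax(scores.flatten(1), dim=1)     # category probabilities
            confidence_maps = class_maps * scores               # per-category confidence maps
            att_map = torch.sigmoid(self.conv_att(confidence_maps))  # M(X_i)
            out = (1.0 + att_map) * g                           # g'(X_i) = (1 + M(X_i)) * g(X_i)
            return out, probs, att_map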
S103: a loss weighting module based on a gray level co-occurrence matrix (GLCM) is constructed on a basis model of medical image classification.
In general, the texture of malignant lesions is complex, while the texture of benign lesions is simple. The entropy of the image is therefore calculated from the gray level co-occurrence matrix; the entropy value reflects the disorder of the image:

Entropy = -Σ_{i,j} P(i, j) · log P(i, j)

where P denotes the GLCM and each entry P(i, j) corresponds to the number of co-occurrences of the gray level pair (i, j).
For example, when the texture of an image is not uniform, many of the GLCM entries are very small and the entropy value is very large.
Common balancing loss weighting strategies assign higher loss weights only to the malignant class, because it has less training data. However, benign lesion samples with larger entropy values are easily confused with malignant lesions and should therefore also be assigned higher loss weights, which helps the network in the basic model of medical image classification focus on these error-prone samples. The application therefore calculates the average entropy of malignant lesions, and the loss weight of benign lesions whose entropy exceeds this threshold is increased.
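A sketch of the entropy computation and the weighting rule using scikit-image (the library choice, the GLCM distance and angle, the benign label 0 and the weight value 2.0 are all assumptions; the patent fixes only the entropy criterion):

    import numpy as np
    from skimage.feature import graycomatrix

    def glcm_entropy(gray_image: np.ndarray) -> float:
        # gray_image: uint8 grayscale image; distance/angle choices are assumptions.
        P = graycomatrix(gray_image, distances=[1], angles=[0], levels=256,
                         symmetric=True, normed=True)[:, :, 0, 0]
        nonzero = P[P > 0]                  # skip zero entries to avoid log(0)
        # Entropy = -sum_{i,j} P(i, j) * log P(i, j)
        return float(-np.sum(nonzero * np.log(nonzero)))

    # Hypothetical weighting rule: benign samples whose entropy exceeds the
    # average entropy of the malignant class get a higher loss weight omega_i.
    def glcm_weight(image, label, mean_malignant_entropy, high_weight=2.0):
        if label == 0 and glcm_entropy(image) > mean_malignant_entropy:  # 0 = benign
            return high_weight
        return 1.0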
S104: and establishing fusion of a loss weighting module based on a gray level co-occurrence matrix (GLCM) and a deep attention branch network based on Class Activation Mapping (CAM) on a basic model of medical image classification to obtain a fused medical image classification model.
As shown in fig. 3, the Class Activation Map (CAM) based Deep Attention Branch Network (DABN) consists of three modules: the prediction branch, the attention branch, and the attention mechanism. The prediction branch is a baseline model such as ResNet; the feature extractor and the classifier are obtained by splitting the prediction branch at a particular layer. The feature extractor extracts feature maps through multiple convolution layers, and the classifier outputs the probability of each category. The attention branch is built on Class Activation Mapping (CAM) after the feature extractor and takes the feature maps as input to obtain the attention map. Finally, the attention mechanism multiplies the attention map with the corresponding feature maps in order to locate the lesion.
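Combining the base model of S101 with the attention branch sketch above, the fused model might be assembled as follows (the split point after ResNet-50's last convolution stage, the 2048-channel width and the simple GAP-plus-FC classifier are assumptions):

    import torch
    import torch.nn as nn

    class DABN(nn.Module):
        # Sketch of the fused network: feature extractor, attention
        # branch with attention mechanism, and prediction-branch classifier.
        def __init__(self, base_model, num_classes: int):
            super().__init__()
            # Feature extractor: all ResNet layers up to the last convolution stage.
            self.feature_extractor = nn.Sequential(*list(base_model.children())[:-2])
            self.attention = AttentionBranch(in_channels=2048, num_classes=num_classes)
            self.gap = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Linear(2048, num_classes)

        def forward(self, x):
            g = self.feature_extractor(x)              # feature maps g(X_i)
            g_att, p_att, att_map = self.attention(g)  # attention branch output and g'(X_i)
            # Prediction branch: classify the attention-weighted features.
            p_pre = torch.softmax(self.fc(self.gap(g_att).flatten(1)), dim=1)
            return p_att, p_pre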
The deep attention branch network can be trained in an end-to-end fashion using a gray level co-occurrence matrix (GLCM) directed loss weighting strategy. The total loss function is as follows:

L_total(X_i) = β · L_att(X_i) + L_pre(X_i)

where L_att(X_i) is the loss of the attention branch for input sample X_i, L_pre(X_i) is the loss of the prediction branch, and L_total(X_i) is a simple weighted sum of the two losses; β controls the attention branch loss weight. Since a test sample only requires the output of the prediction branch as the final classification result, and the output of the attention branch is only used as auxiliary supervision during model training, β = 0.5 is used in the application to obtain a more accurate prediction result. The loss L_bra of each branch is calculated by combining the GLCM-directed loss weighting with the binary cross entropy loss, as follows:

L_bra(X_i) = -ω_i · [y_i · log p_i + (1 - y_i) · log(1 - p_i)]

where y_i is the label, p_i is the branch prediction output, and ω_i is the GLCM-directed weight.
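These two formulas translate into a few lines; in the sketch below, the predictions p are probabilities, y are one-hot float targets and w are the precomputed GLCM-directed weights ω_i (the mean reduction is an assumption):

    import torch
    import torch.nn.functional as F

    def branch_loss(p: torch.Tensor, y: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        # L_bra: binary cross entropy weighted elementwise by the
        # GLCM-directed weights omega_i.
        bce = F.binary_cross_entropy(p, y, reduction="none")
        return (w * bce).mean()

    def total_loss(p_att, p_pre, y, w, beta: float = 0.5):
        # L_total(X_i) = beta * L_att(X_i) + L_pre(X_i), with beta = 0.5.
        return beta * branch_loss(p_att, y, w) + branch_loss(p_pre, y, w)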
Through the above processing of the basic model of medical image classification, the fused medical image classification model is obtained.
S105: training the fused medical image classification model to obtain an automatic medical image classification model.
To train the classification network in the fused medical image classification model provided by the application, the input images can be randomly cropped to a preset fixed size, for example 224×224, and the data can be enhanced by random horizontal flipping, vertical flipping and similar methods. Stochastic gradient descent (SGD) is then used to optimize the network parameters in the fused medical image classification model. For example, the model is trained for 150 epochs; the initial learning rate can be set to 1e-4 and is divided by 10 every 30 epochs. The learning rate controls the step size of each gradient update: a larger learning rate in the early stage speeds up network convergence, while gradually reducing it in later stages helps the search settle into a local optimum. The training batch size is set to 8. Once the fused medical image classification model has been trained with these network parameter settings, the automatic medical image classification model is obtained.
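These settings could translate into a training loop like the following sketch, reusing the hypothetical DABN and total_loss above (train_set is a placeholder dataset assumed to apply train_tf and to yield the per-sample GLCM weights w):

    import torch
    from torch.utils.data import DataLoader
    from torchvision import transforms

    train_tf = transforms.Compose([
        transforms.RandomCrop(224),          # random crop to the preset fixed size
        transforms.RandomHorizontalFlip(),   # data enhancement: random horizontal flip
        transforms.RandomVerticalFlip(),     # and random vertical flip
        transforms.ToTensor(),
    ])

    model = DABN(base_model, num_classes)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)  # SGD, initial lr 1e-4
    # Divide the learning rate by 10 every 30 epochs.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    loader = DataLoader(train_set, batch_size=8, shuffle=True)  # batch size 8

    for epoch in range(150):                 # 150 training epochs
        for x, y, w in loader:               # w: precomputed GLCM-directed weights
            p_att, p_pre = model(x)
            loss = total_loss(p_att, p_pre, y, w)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()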
S106: Automatically classify the images to be classified through the automatic medical image classification model.
To improve the accuracy of classifying the images to be classified, the input image can be preprocessed by cropping its center to the same fixed size preset in step S105; the image is then fed into the automatic medical image classification model for evaluation and classification, yielding the final automatic classification result.
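A matching inference sketch under the same assumptions (image is a PIL image; only the prediction branch output is used at test time):

    import torch
    from torchvision import transforms

    infer_tf = transforms.Compose([
        transforms.CenterCrop(224),  # center crop to the same preset fixed size as in S105
        transforms.ToTensor(),
    ])

    model.eval()
    with torch.no_grad():
        x = infer_tf(image).unsqueeze(0)    # image: a PIL image to be classified
        _, p_pre = model(x)                 # only the prediction branch output is used
        predicted_class = p_pre.argmax(dim=1).item()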
The application also provides a medical image classification device, which comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the steps of the medical image classification method described above.
The application relates to a medical image classification method and a classification device thereof, which combine gray level co-occurrence matrix based loss weighting with a class activation mapping based deep attention branch network. The application can therefore achieve accurate classification of medical images and effectively address the problems of small-target recognition and extreme class imbalance commonly found in medical image classification tasks.
All articles and references, including patent applications and publications, disclosed herein are incorporated by reference for all purposes. The term "consisting essentially of …" describing a combination shall include the identified element, ingredient, component or step as well as other elements, ingredients, components or steps that do not substantially affect the essential novel features of the combination. The use of the terms "comprises" or "comprising" to describe combinations of elements, components, or steps herein also contemplates embodiments consisting essentially of such elements, components, or steps. By using the term "may" herein, it is intended that any attribute described as "may" be included is optional. Multiple elements, components, parts or steps can be provided by a single integrated element, component, part or step. Alternatively, a single integrated element, component, part or step may be divided into separate plural elements, components, parts or steps. The disclosure of "a" or "an" to describe an element, component, section or step is not intended to exclude other elements, components, sections or steps.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other. The above embodiments are provided to illustrate the technical concept and features of the present invention and are intended to enable those skilled in the art to understand the content of the present invention and implement the same, and are not intended to limit the scope of the present invention. All equivalent changes or modifications made in accordance with the spirit of the present invention should be construed to be included in the scope of the present invention.

Claims (9)

1. A method of classifying medical images, the method comprising:
using transfer learning to migrate the deep learning model trained on the natural image data set to the medical image domain to obtain a basic model of medical image classification;
constructing a deep attention branch network based on class activation mapping on the basic model of medical image classification;
constructing a loss weighting module based on a gray level co-occurrence matrix on the basic model of medical image classification;
fusing the loss weighting module based on the gray level co-occurrence matrix with the deep attention branch network based on class activation mapping on the basic model of medical image classification to obtain a fused medical image classification model;
training the fused medical image classification model to obtain an automatic medical image classification model;
and automatically classifying the images to be classified through the automatic medical image classification model.
2. The method of classifying medical images according to claim 1, wherein the step of migrating the deep learning model trained on the natural image data set to the medical image domain using transfer learning to obtain the basic model of medical image classification further comprises: adding a fully connected layer at the end of the trained deep learning model, wherein the number of neurons of the fully connected layer matches the number of classification categories, so that the basic model of medical image classification is adaptively obtained.
3. The method of classifying medical images according to claim 1, wherein the step of constructing a class activation mapping based deep attention branch network on the basic model of medical image classification comprises:
the class activation map has a convolution layer, a global average pool and a fully connected layer as its last three layers, and high-response positions in the class activation map represent lesion areas; after the global average pool, the average value of each feature map of the last convolution layer is obtained, and the averages are then weighted and summed with the fully connected layer weights to obtain the class activation map:

CAM_c(x, y) = Σ_k w_k^c · f_k(x, y)

where f_k(x, y) is the value of the k-th feature map at position (x, y) in the last convolution layer, k indexes the feature maps, w_k^c is the fully connected layer weight for category c, and CAM_c(x, y) is the class activation map for category c.
4. A method of classifying medical images according to claim 3, wherein the step of constructing the class activation mapping based deep attention branch network on the basic model of medical image classification further comprises:
replacing the fully connected layer with a K×1×1 convolution layer in the deep attention branch network, wherein K is the number of categories and K×1×1 means the convolution kernel is 1×1 with K channels;
the attention branch generates the attention map from the K category confidence maps, with the average of each feature map on the last convolution layer, obtained after the global average pool, serving as the confidence score for each category.
5. The method of classifying medical images according to claim 4, wherein the step of constructing the class activation mapping based deep attention branch network on the basic model of medical image classification further comprises:
the output of the attention mechanism is derived from the attention map and the feature map output by the feature extractor, as follows:

g′(X_i) = (1 + M(X_i)) · g(X_i)

where g(X_i) is the feature map output by the feature extractor, M(X_i) is the attention map, and g′(X_i) is the output of the attention mechanism.
6. The method according to claim 5, wherein the step of constructing a gray level co-occurrence matrix based loss weighting module on the basic model of medical image classification specifically comprises:
calculating the entropy of the image from the gray level co-occurrence matrix, wherein the entropy reflects the disorder of the image:

Entropy = -Σ_{i,j} P(i, j) · log P(i, j)

where P denotes the gray level co-occurrence matrix, each entry P(i, j) corresponds to the number of co-occurrences of the gray level pair (i, j), and Entropy is the entropy of the image;
based on the average entropy of malignant lesions, increasing the loss weight of benign lesions whose entropy exceeds this average.
7. The method of classifying medical images according to claim 6, wherein in the step of fusing the gray level co-occurrence matrix based loss weighting module with the class activation mapping based deep attention branch network on the basic model of medical image classification to obtain the fused medical image classification model,
the deep attention branch network is trained in an end-to-end manner using a loss weighting strategy directed by the gray level co-occurrence matrix, with the total loss function:

L_total(X_i) = β · L_att(X_i) + L_pre(X_i)

where L_att(X_i) is the loss of the attention branch for input sample X_i, L_pre(X_i) is the loss of the prediction branch, L_total(X_i) is a simple weighted sum of the two losses, and β controls the attention branch loss weight, with β = 0.5;
the loss L_bra of each branch is calculated by combining the gray level co-occurrence matrix directed loss weighting with the binary cross entropy loss:

L_bra(X_i) = -ω_i · [y_i · log p_i + (1 - y_i) · log(1 - p_i)]

where y_i is the label, p_i is the branch prediction output, and ω_i is the gray level co-occurrence matrix directed weight.
8. The method according to claim 7, wherein in the step of training the fused medical image classification model to obtain the automatic medical image classification model, the input image is randomly cropped to a preset fixed size, the data are enhanced by random horizontal and vertical flipping, and the network parameters in the fused medical image classification model are then optimized using stochastic gradient descent.
9. A medical image classification apparatus comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, performs the steps of: a method of classifying a medical image as claimed in any one of claims 1 to 8.
CN202110038447.8A 2021-01-12 2021-01-12 Medical image classification method and classification device thereof Active CN112700434B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110038447.8A CN112700434B (en) 2021-01-12 2021-01-12 Medical image classification method and classification device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110038447.8A CN112700434B (en) 2021-01-12 2021-01-12 Medical image classification method and classification device thereof

Publications (2)

Publication Number Publication Date
CN112700434A CN112700434A (en) 2021-04-23
CN112700434B true CN112700434B (en) 2024-07-19

Family

ID=75514224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110038447.8A Active CN112700434B (en) 2021-01-12 2021-01-12 Medical image classification method and classification device thereof

Country Status (1)

Country Link
CN (1) CN112700434B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077466A (en) * 2021-05-11 2021-07-06 清华大学深圳国际研究生院 Medical image classification method and device based on multi-scale perception loss
CN113449775B (en) * 2021-06-04 2023-02-24 广州大学 Multi-label image classification method and system based on class activation mapping mechanism
CN113313203B (en) * 2021-06-22 2022-11-01 哈尔滨工程大学 Medical image classification method based on extension theory and deep learning
CN114298234B (en) * 2021-12-31 2022-10-04 深圳市铱硙医疗科技有限公司 Brain medical image classification method and device, computer equipment and storage medium
CN114896307B (en) * 2022-06-30 2022-09-27 北京航空航天大学杭州创新研究院 Time series data enhancement method and device and electronic equipment
CN115908964B (en) * 2022-09-20 2023-12-12 国药(武汉)医学实验室有限公司 Medical image classification method, system, terminal and storage medium
CN117392124B (en) * 2023-12-08 2024-02-13 山东大学 Medical ultrasonic image grading method, system, server, medium and device
CN117788957B (en) * 2024-02-23 2024-06-07 广东电网有限责任公司 Deep learning-based qualification image classification method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311592A (en) * 2020-03-13 2020-06-19 中南大学 Three-dimensional medical image automatic segmentation method based on deep learning
CN112130200A (en) * 2020-09-23 2020-12-25 电子科技大学 Fault identification method based on grad-CAM attention guidance

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11972567B2 (en) * 2018-05-29 2024-04-30 The General Hospital Corporation System and method for analyzing medical images to detect and classify a medical condition using machine-learning and a case pertinent radiology atlas
CN110599451B (en) * 2019-08-05 2023-01-20 平安科技(深圳)有限公司 Medical image focus detection and positioning method, device, equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311592A (en) * 2020-03-13 2020-06-19 中南大学 Three-dimensional medical image automatic segmentation method based on deep learning
CN112130200A (en) * 2020-09-23 2020-12-25 电子科技大学 Fault identification method based on grad-CAM attention guidance

Also Published As

Publication number Publication date
CN112700434A (en) 2021-04-23

Similar Documents

Publication Publication Date Title
CN112700434B (en) Medical image classification method and classification device thereof
CN110728224B (en) Remote sensing image classification method based on attention mechanism depth Contourlet network
CN107506761B (en) Brain image segmentation method and system based on significance learning convolutional neural network
CN108806792B (en) Deep learning face diagnosis system
CN116051574A (en) Semi-supervised segmentation model construction and image analysis method, device and system
CN105528638A (en) Method for grey correlation analysis method to determine number of hidden layer characteristic graphs of convolutional neural network
CN106650314A (en) Method and system for predicting amino acid mutation
CN113113130A (en) Tumor individualized diagnosis and treatment scheme recommendation method
CN112308825B (en) SqueezeNet-based crop leaf disease identification method
CN110222838B (en) Document sorting method and device, electronic equipment and storage medium
CN113592060A (en) Neural network optimization method and device
CN115050022A (en) Crop pest and disease identification method based on multi-level self-adaptive attention
CN114445356A (en) Multi-resolution-based full-field pathological section image tumor rapid positioning method
CN116089708A (en) Agricultural knowledge recommendation method and device
Quach et al. Evaluation of the efficiency of the optimization algorithms for transfer learning on the rice leaf disease dataset
CN117010971B (en) Intelligent health risk providing method and system based on portrait identification
Cao et al. 3D convolutional neural networks fusion model for lung nodule detection onclinical CT scans
CN113407820A (en) Model training method, related system and storage medium
CN117371511A (en) Training method, device, equipment and storage medium for image classification model
CN112070205A (en) Multi-loss model obtaining method and device
Lwin et al. Image Classification for Rice Leaf Disease Using AlexNet Model
CN114998300A (en) Corneal ulcer classification method based on multi-scale information fusion network
Termritthikun et al. Neural architecture search and multi-objective evolutionary algorithms for anomaly detection
CN115131628A (en) Mammary gland image classification method and equipment based on typing auxiliary information
CN114886383A (en) Electroencephalogram signal emotional feature classification method based on transfer learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240620

Address after: Shop 201, No. 11 Huaming Road, Tianhe District, Guangzhou City, Guangdong Province, 510000

Applicant after: Guangzhou Yida Health Management Co.,Ltd.

Country or region after: China

Address before: 215000 Room 101, 5 building, 78 Ling Ling Road, hi tech Zone, Suzhou, Jiangsu.

Applicant before: SUZHOU SIMAWEI TECHNOLOGY Co.,Ltd.

Country or region before: China

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant