CN112329867A - MRI image classification method based on task-driven hierarchical attention network - Google Patents

Publication number
CN112329867A
CN112329867A (application CN202011245707.0A); granted publication CN112329867B
Authority
CN
China
Prior art keywords: network, information, MRI, image, subnetwork
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011245707.0A
Other languages
Chinese (zh)
Other versions
CN112329867B (en)
Inventor
张哲昊
高琳琳
金光
郭立君
张�荣
Current Assignee
ZHEJIANG DEERDA MEDICAL TECHNOLOGY Co.,Ltd.
Original Assignee
Ningbo University
Priority date
Application filed by Ningbo University filed Critical Ningbo University
Priority to CN202011245707.0A priority Critical patent/CN112329867B/en
Publication of CN112329867A publication Critical patent/CN112329867A/en
Application granted granted Critical
Publication of CN112329867B publication Critical patent/CN112329867B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)

Abstract

The invention relates to an MRI image classification method based on a task-driven hierarchical attention network. First, a training set, a verification set and a test set are constructed. Second, the constructed information subnetwork is trained with the training set, and the information subnetwork with the optimal network parameters is obtained via the verification set. Then, the constructed hierarchical attention subnetwork is trained with the training set, and the hierarchical attention subnetwork with the optimal network parameters is obtained via the verification set. Finally, an MRI image is randomly selected from the test set and input into the information subnetwork to obtain an information map; the information map and the MRI image to be classified are then input into the trained hierarchical attention subnetwork to obtain the probability that the MRI image to be classified belongs to each class, and the class with the highest probability is taken as the class of the MRI image to be classified. The method can not only locate the regions relevant to classification but also achieves excellent classification performance.

Description

MRI image classification method based on task-driven hierarchical attention network
Technical Field
The invention relates to an MRI image classification method based on a task-driven hierarchical attention network.
Background
Convolutional Neural Networks (CNNs) are widely used in image classification tasks because of their excellent feature extraction capability. A CNN can take the image directly as input and automatically extract features such as color and texture, avoiding the complex feature extraction and model construction of traditional recognition algorithms.
Because Magnetic Resonance Imaging (MRI) volumes are large, directly applying CNN architectures designed for natural-image classification often yields low accuracy on MRI. Many researchers have therefore adapted such CNN structures, and existing CNN-based MRI image classification methods can be divided into three categories: region-of-interest (ROI) based methods, image-block based methods, and full-image based methods.
ROI-based classification methods first segment relevant regions of the original image using expert domain knowledge, extract features from them, and then construct an MRI image classifier; however, such methods generally cannot cover all classification-relevant regions of the whole MRI image and require complex preprocessing. Image-block based methods typically divide the whole MRI image into multiple image blocks, extract features from each block, and finally fuse the block features in a simple way to classify the sample. They extract local image features well and need no domain knowledge, but because they fuse block features only in the last few CNN layers, some information in the whole image may be lost. Full-image based methods extract features from the entire MRI image; they capture global features and require no expert knowledge. However, since MRI volumes are large and the classification-relevant regions are small, these methods fail to locate those regions accurately, degrading the final classification results. More advanced full-image methods usually adopt an attention mechanism: the network learns a weight for each region of the whole image and improves classification accuracy through weighting.
In summary, ROI-based and image-block based classification methods focus on extracting discriminative local features with different strategies, whereas full-image methods focus on extracting semantic features of the whole image. The first two often neglect to mine whole-image features, while the last does not sufficiently extract the local, classification-relevant region features.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide, in view of the above prior art, an MRI image classification method based on a task-driven hierarchical attention network that can both locate the regions relevant to classification and deliver superior classification performance.
The technical scheme adopted by the invention to solve the above technical problem is as follows: an MRI image classification method based on a task-driven hierarchical attention network, characterized by comprising the following steps:
step 1, acquiring a certain number of MRI images with known types, preprocessing all the MRI images with known types, and then normalizing all the preprocessed MRI images into a uniform size to form a sample set;
step 2, dividing the sample set into a training set, a verification set and a test set;
step 3, constructing an information subnetwork, and training the constructed information subnetwork by using samples in a training set to obtain the trained information subnetwork;
The constructed information subnetwork comprises an information-map extraction network and an image-block classification network. The information-map extraction network is a 3D CNN for classification in which the global average pooling layer and the fully connected layer are replaced by a convolutional layer with a 1 × 1 kernel; it outputs an information map with one channel. The image-block classification network is a 3D CNN for classification.
the specific process of the information subnetwork training is as follows:
Step 3-1: initialize the network parameters of the information-map extraction network and the image-block classification network;
3-2, randomly selecting R MRI images in the training set, and inputting the selected R MRI images into an initialized information graph extraction network to obtain an information graph with the number of channels being 1 corresponding to each MRI image, wherein R is a positive integer;
Step 3-3: extract the K highest values in each information map, and, according to the mapping relation of the 3D CNN, select from the corresponding MRI image the K image blocks of size L × W × H that correspond to those K highest values; K, L, W and H are all positive integers;
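The block-selection step above can be sketched as follows. This is a minimal sketch, not the patent's implementation: the mapping from information-map cells back to image coordinates is assumed to be simple scaling, standing in for the receptive-field mapping of the 3D CNN, and `select_patches` is an illustrative name.

```python
import numpy as np

def select_patches(image, info_map, k, patch=(48, 48, 48)):
    # Indices of the K highest values in the information map, best first.
    flat = np.argsort(info_map.ravel())[::-1][:k]
    cells = np.column_stack(np.unravel_index(flat, info_map.shape))
    scale = np.array(image.shape) / np.array(info_map.shape)
    half = np.array(patch) // 2
    upper = np.array(image.shape) - np.array(patch)
    blocks = []
    for cell in cells:
        center = ((cell + 0.5) * scale).astype(int)  # map cell to image coords
        z, y, x = np.clip(center - half, 0, upper)   # keep the block inside the volume
        blocks.append(image[z:z + patch[0], y:y + patch[1], x:x + patch[2]])
    return blocks, info_map.ravel()[flat]
```

The returned scores are the K highest information-map values, ordered from highest to lowest, matching the s_{r,k} used later in the calibration loss.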
Step 3-4: input the K image blocks of size L × W × H selected from each MRI image into the initialized image-block classification network to obtain the classification probability that each of the K blocks belongs to each category, and compute the confidence of each selected image block from these classification probabilities;
Step 3-5: compute the total loss function L and update the network parameters of the initialized information-map extraction network and image-block classification network by backpropagation according to L, obtaining the updated information-map extraction network and image-block classification network, i.e. the information subnetwork after one round of training;
L = L_{p-cls} + L_{cal}
where L_{p-cls} is the first loss function, the classification loss over the selected image blocks:

L_{p-cls} = -\frac{1}{RK}\sum_{r=1}^{R}\sum_{k=1}^{K}\sum_{j=1}^{N} y_{r,k,j}\,\log c_{r,k,j}

y_{r,k,j} is an indicator variable taking the value 0 or 1: y_{r,k,j} = 1 when the j-th category is the same as the true category of the k-th image block selected from the r-th MRI image, otherwise y_{r,k,j} = 0; c_{r,k,j} is the classification probability that the k-th image block selected from the r-th MRI image belongs to the j-th category; N is the total number of categories; L_{cal} is the second loss function, a calibration loss:
L_{cal} = \frac{1}{RK}\sum_{r=1}^{R}\sum_{k=1}^{K}\left(s_{r,k} - d_{r,k}\right)^{2}

where s_{r,k} is the k-th highest value in the r-th information map and d_{r,k} is the confidence of the k-th image block selected from the r-th MRI image;
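The total loss can be sketched numerically under stated assumptions: L_{p-cls} is taken as the mean cross-entropy over all R·K selected blocks, and L_{cal} as a squared-error alignment between the information-map scores and the block confidences. Both closed forms are assumptions, since the loss formulas appear only as images in the source text.

```python
import numpy as np

def total_loss(y, c, s, d, eps=1e-12):
    """y: (R, K, N) one-hot block labels; c: (R, K, N) block class probabilities;
    s: (R, K) top-K information-map values; d: (R, K) block confidences."""
    r, k, _ = c.shape
    l_p_cls = -np.sum(y * np.log(c + eps)) / (r * k)  # cross-entropy over all blocks
    l_cal = np.mean((s - d) ** 2)                     # score/confidence alignment
    return l_p_cls + l_cal
```

With perfect block predictions and information-map scores equal to the confidences, both terms vanish, which is the behaviour the joint training is driving toward.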
Step 3-6: repeat steps 3-2 to 3-5, successively selecting different MRI images from the training set and inputting them into the once-trained information subnetwork, continuously updating its parameters to obtain the trained information subnetwork;
Step 4: feed the MRI images in the verification set into the information subnetwork trained in step 3, and screen out and save the information subnetwork with the optimal network parameters;
step 5, constructing a hierarchical attention subnetwork, and training the constructed hierarchical attention subnetwork by using a sample in a training set to obtain a trained hierarchical attention subnetwork;
The constructed hierarchical attention subnetwork is a network structure with a 3D CNN as the backbone; in addition, an attention module is arranged after a convolution module of the 3D CNN. The attention module takes the information map as input, and its output F′ is computed as:

F′ = M′ ⊙ F, where M′ = TI(M)

Here F is the feature map output by a given convolution module in the 3D CNN, M is the information map, and TI(·) is a trilinear interpolation function that makes the spatial size of M′ the same as that of F; ⊙ denotes element-wise (Hadamard) multiplication.

The F′ obtained by this formula serves as the input of the layer immediately following the convolution module that output the feature map F;
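The weighting F′ = TI(M) ⊙ F can be sketched with a hand-rolled trilinear resize. This is a minimal single-channel sketch (real feature maps also carry a channel dimension, omitted here); the function names are illustrative.

```python
import numpy as np

def trilinear_resize(m, out_shape):
    """Resize a 3D array by trilinear interpolation (the TI(.) of the patent)."""
    grids = [np.linspace(0, n - 1, t) for n, t in zip(m.shape, out_shape)]
    z, y, x = np.meshgrid(*grids, indexing="ij")
    z0, y0, x0 = np.floor(z).astype(int), np.floor(y).astype(int), np.floor(x).astype(int)
    z1 = np.minimum(z0 + 1, m.shape[0] - 1)
    y1 = np.minimum(y0 + 1, m.shape[1] - 1)
    x1 = np.minimum(x0 + 1, m.shape[2] - 1)
    dz, dy, dx = z - z0, y - y0, x - x0
    # Blend the 8 surrounding corners of each output sample.
    return (m[z0, y0, x0] * (1 - dz) * (1 - dy) * (1 - dx)
          + m[z1, y0, x0] * dz * (1 - dy) * (1 - dx)
          + m[z0, y1, x0] * (1 - dz) * dy * (1 - dx)
          + m[z0, y0, x1] * (1 - dz) * (1 - dy) * dx
          + m[z1, y1, x0] * dz * dy * (1 - dx)
          + m[z1, y0, x1] * dz * (1 - dy) * dx
          + m[z0, y1, x1] * (1 - dz) * dy * dx
          + m[z1, y1, x1] * dz * dy * dx)

def attention_output(f, m):
    """F' = TI(M) * F: resize the information map to the feature map's
    spatial size, then weight the feature map element-wise."""
    return trilinear_resize(m, f.shape) * f
```

Because the same information map is resized to whatever spatial size each convolution module outputs, one map can drive attention at every level of the backbone.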
the specific process of the hierarchical attention subnetwork training comprises the following steps:
step 5-1, initializing network parameters in a hierarchical attention sub-network;
Step 5-2: pass Q MRI images from the training set through the information subnetwork saved in step 4 to obtain Q information maps; take the Q information maps as the input of the attention modules while inputting the Q MRI images into the hierarchical attention subnetwork equipped with the attention modules, where Q is a positive integer;
Step 5-3: compute the loss function L_{a-cls} of the hierarchical attention subnetwork and update its network parameters according to this loss, obtaining a once-updated hierarchical attention subnetwork;

L_{a-cls} = -\frac{1}{Q}\sum_{q=1}^{Q}\sum_{j=1}^{N} x_{q,j}\,\log c'_{q,j}

where x_{q,j} is an indicator variable taking the value 0 or 1: x_{q,j} = 1 when the j-th category is the same as the true category of the q-th MRI image, otherwise x_{q,j} = 0; I_q is the q-th MRI image input into the hierarchical attention subnetwork; M_q is the information map obtained by passing I_q through the information subnetwork; and c'_{q,j} is the probability that I_q and M_q, input into the hierarchical attention subnetwork, are predicted as the j-th category;
Step 5-4: successively select different MRI images from the training set and repeat steps 5-2 to 5-3, continuously updating the parameters of the hierarchical attention subnetwork to obtain the trained hierarchical attention subnetwork;
Step 6: feed the MRI images in the verification set into the hierarchical attention subnetwork trained in step 5, and screen out and save the hierarchical attention subnetwork with the optimal network parameters;
Step 7: obtain the category of the MRI image to be classified. The specific process is as follows: randomly select an MRI image from the test set and denote it as the MRI image I′ to be classified; input I′ into the information subnetwork obtained in step 4 to obtain an information map M′; then input M′ and I′ into the hierarchical attention subnetwork obtained in step 6 to obtain the probability that I′ belongs to each class, and take the class with the highest probability as the class of the MRI image to be classified.
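Once both subnetworks are trained, the inference step above reduces to a few lines; `info_net` and `attn_net` below are hypothetical callables standing in for the saved subnetworks, not real implementations.

```python
import numpy as np

def classify(image, info_net, attn_net):
    info_map = info_net(image)         # M': information map for the test image
    probs = attn_net(image, info_map)  # per-class probabilities
    return int(np.argmax(probs))       # class with the highest probability
```

Note that the information subnetwork runs first and its output is a second input to the attention subnetwork, so the two stages cannot be fused into a single forward pass.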
Preferably, the information-map extraction network constructed in step 3 is a 3D ResNet18 with the global average pooling and fully connected layers removed and three convolutional layers with 1 × 1 kernels added, and the constructed image-block classification network is a 3D ResNet10.
Further, the confidence of each image block in step 3-4 is computed as follows: the category of each image block is one-hot encoded, and the classification probability corresponding to the position coded as 1 is taken as the confidence of that image block.
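The confidence rule above, sketched (the function name is illustrative):

```python
import numpy as np

def block_confidence(probs, label):
    one_hot = np.eye(len(probs))[label]    # code the true category as 1
    return float(np.sum(probs * one_hot))  # probability at the position coded 1
```

The dot product with the one-hot vector simply reads off `probs[label]`; writing it this way mirrors the one-hot formulation used in the patent text.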
Preferably, the hierarchical attention sub-network constructed in step 5 is a network with 3D ResNet34 as a backbone.
Compared with the prior art, the invention has the following advantages. The information subnetwork first produces an information map encoding how important each region of the original image is to classification; the hierarchical attention subnetwork then uses this map to strengthen the network's extraction of features from the important regions. By locating first and then reinforcing local features, the method markedly increases the CNN's attention to classification-relevant regions while effectively combining whole-image features, improving classification accuracy.
Drawings
FIG. 1 is a schematic block diagram of an information subnetwork in an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a hierarchical attention subnetwork in an embodiment of the invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings and embodiments.
An MRI image classification method based on a task-driven hierarchical attention network comprises the following steps:
step 1, acquiring a certain number of MRI images with known types, preprocessing all the MRI images with known types, and then normalizing all the preprocessed MRI images into a uniform size to form a sample set;
step 2, dividing the sample set into a training set, a verification set and a test set;
step 3, constructing an information subnetwork, and training the constructed information subnetwork by using samples in a training set to obtain the trained information subnetwork;
The constructed information subnetwork comprises an information-map extraction network and an image-block classification network. The information-map extraction network is a 3D CNN for classification in which the global average pooling layer and the fully connected layer are replaced by a convolutional layer with a 1 × 1 kernel; it outputs an information map with one channel. The image-block classification network is a 3D CNN for classification.
the specific process of information subnetwork training is as follows:
Step 3-1: initialize the network parameters of the information-map extraction network and the image-block classification network;
3-2, randomly selecting R MRI images in the training set, and inputting the selected R MRI images into an initialized information graph extraction network to obtain an information graph with the number of channels being 1 corresponding to each MRI image, wherein R is a positive integer;
Step 3-3: extract the K highest values in each information map, and, according to the mapping relation of the 3D CNN, select from the corresponding MRI image the K image blocks of size L × W × H that correspond to those K highest values; K, L, W and H are all positive integers;
Step 3-4: input the K image blocks of size L × W × H selected from each MRI image into the initialized image-block classification network to obtain the classification probability that each of the K blocks belongs to each category, and compute the confidence of each selected image block from these classification probabilities;
The confidence of each image block is computed as follows: the category of each image block is one-hot encoded, and the classification probability corresponding to the position coded as 1 is taken as the confidence of that image block;
Step 3-5: compute the total loss function L and update the network parameters of the initialized information-map extraction network and image-block classification network by backpropagation according to L, obtaining the updated information-map extraction network and image-block classification network, i.e. the information subnetwork after one round of training;
L = L_{p-cls} + L_{cal}
where L_{p-cls} is the first loss function, the classification loss over the selected image blocks:

L_{p-cls} = -\frac{1}{RK}\sum_{r=1}^{R}\sum_{k=1}^{K}\sum_{j=1}^{N} y_{r,k,j}\,\log c_{r,k,j}

y_{r,k,j} is an indicator variable taking the value 0 or 1: y_{r,k,j} = 1 when the j-th category is the same as the true category of the k-th image block selected from the r-th MRI image, otherwise y_{r,k,j} = 0; c_{r,k,j} is the classification probability that the k-th image block selected from the r-th MRI image belongs to the j-th category; N is the total number of categories; L_{cal} is the second loss function, a calibration loss:
L_{cal} = \frac{1}{RK}\sum_{r=1}^{R}\sum_{k=1}^{K}\left(s_{r,k} - d_{r,k}\right)^{2}

where s_{r,k} is the k-th highest value in the r-th information map and d_{r,k} is the confidence of the k-th image block selected from the r-th MRI image;
Step 3-6: repeat steps 3-2 to 3-5, successively selecting different MRI images from the training set and inputting them into the once-trained information subnetwork, continuously updating its parameters to obtain the trained information subnetwork;
Step 4: feed the MRI images in the verification set into the information subnetwork trained in step 3, and screen out and save the information subnetwork with the optimal network parameters;
step 5, constructing a hierarchical attention subnetwork, and training the constructed hierarchical attention subnetwork by using a sample in a training set to obtain a trained hierarchical attention subnetwork;
The constructed hierarchical attention subnetwork is a network structure with a 3D CNN as the backbone; in addition, an attention module is arranged after a convolution module of the 3D CNN. The attention module takes the information map as input, and its output F′ is computed as:

F′ = M′ ⊙ F, where M′ = TI(M)

Here F is the feature map output by a given convolution module in the 3D CNN, M is the information map, and TI(·) is a trilinear interpolation function that makes the spatial size of M′ the same as that of F; ⊙ denotes element-wise (Hadamard) multiplication.

The F′ obtained by this formula serves as the input of the network layer immediately following the convolution module that output the feature map F;
the convolution module comprises a plurality of convolution layers, and an activation function is arranged behind each convolution layer;
the specific process of the training of the hierarchical attention subnetwork is as follows:
step 5-1, initializing network parameters in a hierarchical attention sub-network;
Step 5-2: pass Q MRI images from the training set through the information subnetwork saved in step 4 to obtain Q information maps; take the Q information maps as the input of the attention modules while inputting the Q MRI images into the hierarchical attention subnetwork equipped with the attention modules, where Q is a positive integer;
Step 5-3: compute the loss function L_{a-cls} of the hierarchical attention subnetwork and update its network parameters by backpropagation according to this loss, obtaining a once-updated hierarchical attention subnetwork;

L_{a-cls} = -\frac{1}{Q}\sum_{q=1}^{Q}\sum_{j=1}^{N} x_{q,j}\,\log c'_{q,j}

where x_{q,j} is an indicator variable taking the value 0 or 1: x_{q,j} = 1 when the j-th category is the same as the true category of the q-th MRI image, otherwise x_{q,j} = 0; I_q is the q-th MRI image input into the hierarchical attention subnetwork; M_q is the information map obtained by passing I_q through the information subnetwork; and c'_{q,j} is the probability that I_q and M_q, input into the hierarchical attention subnetwork, are predicted as the j-th category;
Step 5-4: successively select different MRI images from the training set and repeat steps 5-2 to 5-3, continuously updating the parameters of the hierarchical attention subnetwork to obtain the trained hierarchical attention subnetwork;
Step 6: feed the MRI images in the verification set into the hierarchical attention subnetwork trained in step 5, and screen out and save the hierarchical attention subnetwork with the optimal network parameters;
Step 7: obtain the category of the MRI image to be classified. The specific process is as follows: randomly select an MRI image from the test set and denote it as the MRI image I′ to be classified; input I′ into the information subnetwork obtained in step 4 to obtain an information map M′; then input M′ and I′ into the hierarchical attention subnetwork obtained in step 6 to obtain the probability that I′ belongs to each class, and take the class with the highest probability as the class of the MRI image to be classified.
To better illustrate the effect of the MRI image classification method based on a task-driven hierarchical attention network proposed by the present invention, in this embodiment the method is applied to determine whether an MRI image contains image features of Alzheimer's disease. As shown in FIG. 1, the specific parameters and methods used in this embodiment are as follows.

In step 1, the original brain MRI image data are preprocessed (e.g., affine registration) using FMRIB Software Library 5.0; all images are unified to the same size (128 × 128 × 128) by trilinear interpolation and zero-padding, and the number of channels is set to 1. In step 2, the preferred ratio of training, verification and test sets is 7:2:1, and the training and verification sets must cover all MRI image categories. The information-map extraction network constructed in step 3 is a 3D ResNet18 with the global average pooling and fully connected layers removed and three convolutional layers with 1 × 1 kernels added; the constructed image-block classification network is a 3D ResNet10. The specific network structures are shown in Table 1.

In step 3-4, K = 4 and the block size L × W × H is 48 × 48 × 48. The four 48 × 48 × 48 image blocks selected above are input into the image-block classification network to obtain the classification probability of each block. As shown in FIG. 1, this embodiment has two categories: the first comprises MRI images with Alzheimer's-disease characteristic features and the second comprises MRI images without them. The true category of each image block selected from each MRI image is one-hot encoded. In FIG. 1, c_{r,1,1} is the classification probability that the first image block selected from the r-th MRI image belongs to the first category, c_{r,1,2} the probability that it belongs to the second category, c_{r,K,1} the probability that the K-th block belongs to the first category, and c_{r,K,2} the probability that the K-th block belongs to the second category; the first loss function L_{p-cls} is computed from these probabilities and the true label of the r-th MRI image. In addition, the classification probability corresponding to the category coded as 1 is taken as the confidence of each block: d_{r,1} is the confidence of the 1st image block selected from the r-th MRI image, and d_{r,K} that of the K-th. The second loss function L_{cal} can then be computed from the confidences of the K blocks selected from each MRI image and the K highest values in its information map.

As shown in FIG. 2, the hierarchical attention subnetwork constructed in step 5 is a network with 3D ResNet34 as the backbone, with specific parameters as shown in Table 1. Modules 1, 2, 3 and 4 are all convolution modules, and the network contains four attention modules, connected between module 1 and module 2, between module 2 and module 3, between module 3 and module 4, and between module 4 and the global average pooling layer, respectively. The network parameters of the hierarchical attention subnetwork are updated with methods commonly used in existing 3D CNNs.
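The interleaving of the four convolution modules and four attention modules described above can be sketched abstractly; `modules`, `attend` and `head` are hypothetical stand-ins for the 3D ResNet34 stages, the F′ = TI(M) ⊙ F weighting, and the pooling-plus-classifier head, none of which are specified as code in the patent.

```python
def haanet_forward(image, info_map, modules, attend, head):
    f = image
    for module in modules:       # module 1 .. module 4
        f = module(f)
        f = attend(f, info_map)  # attention module after each convolution module
    return head(f)               # global average pooling + classifier
```

Because every stage is re-weighted by the same task-driven information map, the classification-relevant regions stay emphasized from shallow to deep layers, which is the point made in the comparison with generic spatial attention below.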
Table 1: Concrete structure of each network
(Table 1 is provided as an image in the original publication.)
Compared with ROI-based and image-block based methods, the method of the invention considers not only feature extraction from local lesion regions but also the overall structural features of the whole image, further improving classification ability. Compared with full-image CNN methods, the invention adopts an attention mechanism so that the network focuses more on extracting local features without losing global information. Moreover, in the invention the attention maps used from shallow layers to deep layers are always related to the classification regions, whereas a general spatial attention network typically generates attention maps from the network itself; owing to vanishing gradients, the attention maps generated at shallow layers cannot directly coincide with disease-related regions, which hampers such general attention networks. The method therefore both locates the classification-relevant regions and achieves excellent classification performance.

Claims (4)

1. An MRI image classification method based on a task-driven hierarchical attention network, characterized by comprising the following steps:
step 1, acquiring a certain number of MRI images with known types, preprocessing all the MRI images with known types, and then normalizing all the preprocessed MRI images into a uniform size to form a sample set;
step 2, dividing the sample set into a training set, a verification set and a test set;
step 3, constructing an information subnetwork, and training the constructed information subnetwork by using samples in a training set to obtain the trained information subnetwork;
the constructed information subnetwork comprises an information extraction network and an image block classification network; the information extraction network is a 3D CNN for classification in which the global average pooling layer and the fully connected layer are replaced by a convolution layer with a 1 × 1 kernel, and it outputs an information map with one channel; the image block classification network is a 3D CNN for classification;
the specific process of the information subnetwork training is as follows:
step 3-1, initializing network parameters in an information extraction network and an image block classification network;
step 3-2, randomly selecting R MRI images from the training set, and inputting the selected R MRI images into the initialized information extraction network to obtain an information map with one channel corresponding to each MRI image, wherein R is a positive integer;
step 3-3, respectively extracting the K highest values in each information map, and, according to the mapping relation of the 3D CNN, selecting K image blocks of size L × W × H in the MRI image at the positions corresponding to the K highest values in each information map, wherein K, L, W and H are all positive integers;
step 3-4, inputting the K image blocks of size L × W × H selected from each MRI image into the initialized image block classification network to obtain the classification probability that each of the K image blocks belongs to each category, and calculating the confidence of each image block selected from each MRI image according to these classification probabilities;
3-5, calculating a total loss function L, and reversely updating network parameters in the initialized information extraction network and the initialized image block classification network according to the total loss function L to respectively obtain an updated information extraction network and an image block classification network, so as to obtain an information subnetwork after one training;
L = L_{p-cls} + L_{cal}
wherein L_{p-cls} is the first loss function,
L_{p-cls} = -(1/(R·K)) · Σ_{r=1}^{R} Σ_{k=1}^{K} Σ_{j=1}^{N} y_{r,k,j} log c_{r,k,j}
wherein y_{r,k,j} is an indicator variable taking the value 0 or 1: y_{r,k,j} = 1 when the j-th category is the same as the category of the r-th MRI image from which the k-th image block was selected, and y_{r,k,j} = 0 otherwise; c_{r,k,j} is the classification probability that the k-th image block selected from the r-th MRI image belongs to the j-th category; N is the total number of categories; L_{cal} is the second loss function,
L_{cal} = (1/(R·K)) · Σ_{r=1}^{R} Σ_{k=1}^{K} (s_{r,k} - d_{r,k})²
wherein s_{r,k} is the k-th highest value in the r-th information map, and d_{r,k} is the confidence of the k-th image block selected from the r-th MRI image;
3-6, repeating the steps 3-2 to 3-5, sequentially selecting different MRI images from the training set, inputting the selected multiple MRI images into the information sub-network after one-time training, and continuously updating parameters in the information sub-network to obtain the trained information sub-network;
step 4, sending the MRI images in the verification set into the information sub-network after the training in the step 3, and screening and storing the information sub-network with the optimal network parameters;
step 5, constructing a hierarchical attention subnetwork, and training the constructed hierarchical attention subnetwork by using a sample in a training set to obtain a trained hierarchical attention subnetwork;
the constructed hierarchical attention subnetwork is a network structure taking a 3D CNN as its backbone; in addition, an attention module is arranged behind each convolution module of the 3D CNN; the attention module takes the information map as input, and the output F′ of the attention module is calculated as:
F′ = M′ ⊙ F, wherein M′ = TI(M)
wherein F is a feature map output by a certain convolution module in the 3D CNN, M is the information map, TI( ) is a trilinear interpolation function that makes the spatial size of M′ the same as that of F, and ⊙ denotes the dot product (element-wise multiplication) operation;
the F′ obtained by the above formula is used as the input of the layer following the convolution module that outputs the feature map F in the 3D CNN;
the specific process of the hierarchical attention subnetwork training comprises the following steps:
step 5-1, initializing network parameters in a hierarchical attention sub-network;
step 5-2, passing the Q MRI images in the training set through the information subnetwork stored in step 4 to obtain Q information maps, taking the Q information maps as the input of the attention modules, and simultaneously inputting the Q MRI images into the hierarchical attention subnetwork provided with the attention modules, wherein Q is a positive integer;
step 5-3, calculating a loss function L_{a-cls} of the hierarchical attention subnetwork, and reversely updating the network parameters in the hierarchical attention subnetwork according to the loss function to obtain a once-updated hierarchical attention subnetwork;
L_{a-cls} = -(1/Q) · Σ_{q=1}^{Q} Σ_{j=1}^{N} x_{q,j} log c′_{q,j}
wherein x_{q,j} is an indicator variable taking the value 0 or 1: x_{q,j} = 1 when the j-th category is the true category of the q-th MRI image, and x_{q,j} = 0 otherwise; I_q is the q-th MRI image input into the hierarchical attention subnetwork; M_q is the information map obtained after the MRI image I_q passes through the information subnetwork; and c′_{q,j} is the probability that the q-th MRI image I_q and the information map M_q, input into the hierarchical attention subnetwork, are predicted as the j-th category;
step 5-4, sequentially selecting different MRI images from the training set, repeating steps 5-2 to 5-3, and continuously updating the parameters in the hierarchical attention subnetwork to obtain the trained hierarchical attention subnetwork;
step 6, sending the MRI images in the verification set into the hierarchical attention subnetwork trained in step 5, and screening and storing the hierarchical attention subnetwork with the optimal network parameters;
and step 7, obtaining the category of an MRI image to be classified, the specific process being as follows: randomly selecting one MRI image from the test set and recording it as the MRI image I′ to be classified; inputting the MRI image I′ to be classified into the information subnetwork obtained in step 4 to obtain an information map M′; then inputting the information map M′ and the MRI image I′ to be classified into the hierarchical attention subnetwork obtained in step 6 to obtain the probability that the MRI image I′ to be classified belongs to each category; and taking the category corresponding to the highest probability as the category of the MRI image to be classified.
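As an illustration of the training objective in claim 1, the following NumPy sketch computes the total loss L = L_{p-cls} + L_{cal}, assuming L_{p-cls} is a standard cross-entropy over the R × K image blocks and L_{cal} is a squared-error calibration term between the information-map values s_{r,k} and the patch confidences d_{r,k}; these exact functional forms are an assumption here, since the claim's formula images are not reproduced:

```python
import numpy as np

def patch_cls_loss(y, c):
    """Assumed cross-entropy L_p-cls over R images x K patches x N classes.
    y: one-hot indicators y[r,k,j]; c: predicted probabilities c[r,k,j]."""
    R, K, _ = y.shape
    return -np.sum(y * np.log(c + 1e-12)) / (R * K)

def cal_loss(s, d):
    """Assumed calibration loss L_cal: mean squared gap between the K highest
    information-map values s[r,k] and the patch confidences d[r,k]."""
    return np.mean((s - d) ** 2)

# Toy example: R=2 images, K=3 patches, N=2 classes.
rng = np.random.default_rng(0)
c = rng.dirichlet(np.ones(2), size=(2, 3))   # patch class probabilities
labels = np.array([[0, 0, 0], [1, 1, 1]])    # each patch inherits its image's label
y = np.eye(2)[labels]                        # one-hot indicators y[r,k,j]
s = rng.random((2, 3))                       # top-K information-map values
d = np.take_along_axis(c, labels[..., None], -1)[..., 0]  # confidence = prob of true class
L = patch_cls_loss(y, c) + cal_loss(s, d)    # total loss L = L_p-cls + L_cal
assert L > 0
```

Minimising L_{cal} pushes the information map to assign high values exactly where the image block classifier is confident, which is what ties the two networks of the information subnetwork together.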
2. The MRI image classification method according to claim 1, characterized in that: the information map extraction network constructed in step 3 is a 3D ResNet18 with the global average pooling layer and the fully connected layer removed and three convolution layers with 1 × 1 kernels added, and the constructed image block classification network is a 3D ResNet10.
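The 1 × 1 convolution head of claim 2 reduces to a per-voxel linear map across channels. The sketch below illustrates this with arbitrary channel widths (64 → 16 → 4 → 1); the widths and the ReLU placement are assumptions for illustration, not the patented configuration:

```python
import numpy as np

def conv1x1x1(x, w, b):
    """1x1x1 3D convolution: a linear map over the channel axis applied
    independently at every voxel. x: (C_in, D, H, W); w: (C_out, C_in)."""
    return np.einsum('oc,cdhw->odhw', w, x) + b[:, None, None, None]

# Collapse a 64-channel backbone feature map to a 1-channel information map
# through three 1x1x1 layers, with ReLU between layers.
rng = np.random.default_rng(1)
x = rng.random((64, 6, 6, 6))
for c_in, c_out in [(64, 16), (16, 4), (4, 1)]:
    w = rng.standard_normal((c_out, c_in)) * 0.1
    b = np.zeros(c_out)
    x = conv1x1x1(x, w, b)
    if c_out > 1:
        x = np.maximum(x, 0.0)  # ReLU between layers
info_map = x[0]                 # single-channel information map
assert info_map.shape == (6, 6, 6)
```

Because the kernel covers a single voxel, the spatial size of the information map equals that of the backbone's last feature map, which is what makes the mapping back to L × W × H image blocks in step 3-3 straightforward.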
3. The MRI image classification method according to claim 1, characterized in that: the confidence of each image block in step 3-4 is calculated as follows: performing one-hot coding on the category to which each image block belongs, and taking the classification probability corresponding to the category coded as 1 as the confidence of that image block.
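The confidence rule of claim 3 simply reads off the predicted probability of the true category. A minimal sketch:

```python
import numpy as np

def patch_confidence(probs, true_class):
    """Confidence of an image block per claim 3: one-hot encode the true
    class and keep the classification probability where the code is 1."""
    one_hot = np.eye(probs.shape[-1])[true_class]
    return float(np.sum(one_hot * probs))

probs = np.array([0.1, 0.7, 0.2])  # softmax output over 3 classes
assert abs(patch_confidence(probs, 1) - 0.7) < 1e-12
```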
4. The MRI image classification method according to claim 1, characterized in that: the hierarchical attention subnetwork constructed in the step 5 is a network with 3D ResNet34 as a backbone.
CN202011245707.0A 2020-11-10 2020-11-10 MRI image classification method based on task-driven hierarchical attention network Active CN112329867B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011245707.0A CN112329867B (en) 2020-11-10 2020-11-10 MRI image classification method based on task-driven hierarchical attention network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011245707.0A CN112329867B (en) 2020-11-10 2020-11-10 MRI image classification method based on task-driven hierarchical attention network

Publications (2)

Publication Number Publication Date
CN112329867A true CN112329867A (en) 2021-02-05
CN112329867B CN112329867B (en) 2021-05-25

Family

ID=74317984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011245707.0A Active CN112329867B (en) 2020-11-10 2020-11-10 MRI image classification method based on task-driven hierarchical attention network

Country Status (1)

Country Link
CN (1) CN112329867B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949728A (en) * 2021-03-11 2021-06-11 宁波大学 MRI image classification method based on slice image screening and feature aggregation
CN114565593A (en) * 2022-03-04 2022-05-31 杭州电子科技大学 Full-view digital image classification and detection method based on semi-supervision and attention

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190122073A1 (en) * 2017-10-23 2019-04-25 The Charles Stark Draper Laboratory, Inc. System and method for quantifying uncertainty in reasoning about 2d and 3d spatial features with a computer machine learning architecture
US10304208B1 (en) * 2018-02-12 2019-05-28 Avodah Labs, Inc. Automated gesture identification using neural networks
CN109871869A (en) * 2019-01-11 2019-06-11 五邑大学 A kind of Lung neoplasm classification method and its device
CN110059587A (en) * 2019-03-29 2019-07-26 西安交通大学 Human bodys' response method based on space-time attention
CN110197215A (en) * 2019-05-22 2019-09-03 深圳市牧月科技有限公司 A kind of ground perception point cloud semantic segmentation method of autonomous driving
US20200058156A1 (en) * 2018-08-17 2020-02-20 Nec Laboratories America, Inc. Dense three-dimensional correspondence estimation with multi-level metric learning and hierarchical matching
CN110969626A (en) * 2019-11-27 2020-04-07 西南交通大学 Method for extracting hippocampus of human brain nuclear magnetic resonance image based on 3D neural network
CN111104898A (en) * 2019-12-18 2020-05-05 武汉大学 Image scene classification method and device based on target semantics and attention mechanism
CN111144448A (en) * 2019-12-09 2020-05-12 江南大学 Video barrage emotion analysis method based on multi-scale attention convolutional coding network
CN111291700A (en) * 2020-02-20 2020-06-16 苏州科达科技股份有限公司 Face attribute identification method, device and equipment and readable storage medium
CN111325155A (en) * 2020-02-21 2020-06-23 重庆邮电大学 Video motion recognition method based on residual difference type 3D CNN and multi-mode feature fusion strategy
CN111461973A (en) * 2020-01-17 2020-07-28 华中科技大学 Super-resolution reconstruction method and system for image

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIALI SHI et al.: "Dual dense context-aware network for hippocampal segmentation", Biomedical Signal Processing and Control *
ZHONGYI HAN et al.: "Accurate Screening of COVID-19 Using Attention-Based Deep 3D Multiple Instance Learning", IEEE Transactions on Medical Imaging *
SHI Jiali et al.: "MRI hippocampus segmentation based on 2D DenseU-net", Imaging Science and Photochemistry *
WANG Xiaodong: "Research on multi-scale object recognition in three-dimensional ultrasound images based on an attention mechanism", China Masters' Theses Full-text Database, Information Science and Technology *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949728A (en) * 2021-03-11 2021-06-11 宁波大学 MRI image classification method based on slice image screening and feature aggregation
CN112949728B (en) * 2021-03-11 2021-09-14 宁波大学 MRI image classification method based on slice image screening and feature aggregation
CN114565593A (en) * 2022-03-04 2022-05-31 杭州电子科技大学 Full-view digital image classification and detection method based on semi-supervision and attention
CN114565593B (en) * 2022-03-04 2024-04-02 杭州电子科技大学 Full-field digital image classification and detection method based on semi-supervision and attention

Also Published As

Publication number Publication date
CN112329867B (en) 2021-05-25

Similar Documents

Publication Publication Date Title
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN109685819B (en) Three-dimensional medical image segmentation method based on feature enhancement
CN109214989B (en) Single image super resolution ratio reconstruction method based on Orientation Features prediction priori
CN112017198A (en) Right ventricle segmentation method and device based on self-attention mechanism multi-scale features
CN112329867B (en) MRI image classification method based on task-driven hierarchical attention network
CN112767417B (en) Multi-modal image segmentation method based on cascaded U-Net network
CN112488976B (en) Multi-modal medical image fusion method based on DARTS network
CN111639587B (en) Hyperspectral image classification method based on multi-scale spectrum space convolution neural network
CN110136122B (en) Brain MR image segmentation method based on attention depth feature reconstruction
CN105809175A (en) Encephaledema segmentation method and system based on support vector machine algorithm
CN114463605B (en) Continuous learning image classification method and device based on deep learning
CN111639697B (en) Hyperspectral image classification method based on non-repeated sampling and prototype network
CN115375711A (en) Image segmentation method of global context attention network based on multi-scale fusion
CN110598564A (en) OpenStreetMap-based high-spatial-resolution remote sensing image transfer learning classification method
CN107292346A (en) A kind of MR image hippocampus partitioning algorithms learnt based on Local Subspace
CN116310466A (en) Small sample image classification method based on local irrelevant area screening graph neural network
CN113269774A (en) Parkinson disease classification and lesion region labeling method of MRI (magnetic resonance imaging) image
CN117593199A (en) Double-flow remote sensing image fusion method based on Gaussian prior distribution self-attention
CN117333750A (en) Spatial registration and local global multi-scale multi-modal medical image fusion method
CN117152179A (en) Segmentation and classification method for realizing rectal cancer CT image based on U-Net and SENet
CN116895016A (en) SAR image ship target generation and classification method
CN111223113A (en) Nuclear magnetic resonance hippocampus segmentation algorithm based on dual dense context-aware network
CN116630964A (en) Food image segmentation method based on discrete wavelet attention network
CN114764880B (en) Multi-component GAN reconstructed remote sensing image scene classification method
CN116309465A (en) Tongue image detection and positioning method based on improved YOLOv5 in natural environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210719

Address after: 315021 room 105, building 1, No. 8, Baoma street, Jiangbei District, Ningbo City, Zhejiang Province

Patentee after: ZHEJIANG DEERDA MEDICAL TECHNOLOGY Co.,Ltd.

Address before: No. 818 Fenghua Road, Jiangbei District, Ningbo, Zhejiang 315211

Patentee before: Ningbo University