CN113781385A

CN113781385A - Joint attention graph convolution method for automatic classification of brain medical images

Info

Publication number: CN113781385A
Application number: CN202110299393.0A
Authority: CN
Inventors: 朱旗; 张耕
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2021-03-19
Filing date: 2021-03-19
Publication date: 2021-12-10
Anticipated expiration: 2041-03-19
Also published as: CN113781385B

Abstract

The invention discloses an automatic medical image identification technique based on a graph convolutional neural network. Using brain diffusion tensor imaging (DTI) and functional magnetic resonance imaging (FMRI), it provides a machine learning method for automatic diagnosis that fuses the two modalities and preserves and exploits the graph structure information of the brain. Current brain image classification lacks an effective method for fusing multimodal images and learns the graph structure information in the data poorly. Specifically, neighbor information among brain region nodes is first aggregated on the brain diffusion tensor imaging data, and joint attention over the brain regions is extracted. A graph convolution operation is then performed on the functional magnetic resonance imaging data, guided by the brain region attention. Finally, the resulting features are fed into a multilayer perceptron for automatic identification and classification.

Description

Joint attention graph convolution method for automatic classification of brain medical images

Technical field: machine learning, in particular the automatic identification of brain medical images.

Background art: a brain network is constructed from the correlations between brain regions in order to observe the functional characteristics of the brain, and machine learning applies this technique to the automatic identification of medical images. Researchers in machine learning have proposed many novel models for the automatic recognition and analysis of brain medical images. The graph convolutional neural network is a deep neural network suited to analyzing and processing graph-structured data; researchers in deep learning address the processing of graph-structured data by designing graph convolutional network frameworks.

Disclosure of Invention

Object of the Invention

Automatic identification of medical images is an important aid to doctors' work, and good techniques in this field can greatly improve the diagnostic level of hospitals. Existing automatic identification methods for brain images suffer from poor robustness and insufficient interpretability. Existing machine learning models do not adapt well to the graph structure of the functional brain network, and a multimodal framework that organically integrates functional magnetic resonance imaging (FMRI) and diffusion tensor imaging (DTI) is lacking.

To solve these problems, the invention seeks a technical scheme that organically fuses brain FMRI and DTI, adapts well to the graph structure of the brain network, and is robust and interpretable.

Technical scheme

In order to achieve the purpose, the technical scheme adopted by the invention comprises the following four steps:

(I) Constructing a high-order functional brain network based on FMRI. The high-order functional brain network is built from the FMRI signal vectors of the brain regions; a fully connected brain network built among all brain regions by sparse representation better reflects the high-order relations and latent information of brain region function.

(II) Constructing a structural network based on DTI for extracting brain region attention information. To extract the attention weights of the brain network, a brain network graph is constructed with DTI structural data as the edges between nodes and FMRI signals as the node feature vectors.

(III) Calculating first-order, second-order and third-order attention scores of the brain regions through node aggregation, extracting joint attention over the DTI and FMRI modalities, and using it in the pooling of the GCN. This attention calculation requires no additional parameters, so it is simple to compute. The extracted attention scores are then applied to the pooling layer of the GCN. The strength of functional activity is measured on each edge of the structural network represented by DTI, and an attention score is then evaluated for each brain region node. A higher attention score for a node on the structural network indicates that the brain region carries out more functional activity with other brain regions through the edges of the DTI structural network.

(IV) Executing graph convolution and pooling operations and classifying the output features with a multilayer perceptron to obtain the final identification result. The nodes are ranked by the joint attention scores learned over the DTI and FMRI modalities; unimportant nodes are screened out and nodes with high attention are retained. In the downsampling pooling layers of the GCN, discarding nodes layer by layer improves the efficiency of fusing each node's high-order neighbors, but it also breaks the structural integrity of the brain network graph, so that the later layers lose the ability to fuse all nodes. A readout layer is therefore added after each pooling layer of the GCN; the readout layer performs one global aggregation over the graph nodes of the current layer, and the result is passed to the multilayer perceptron for final classification.

Drawings

FIG. 1 Construction of the high-order functional brain network graph

FIG. 2 Construction of the DTI structural brain network graph

FIG. 3 Aggregation of node attention

FIG. 4 Structure of the GCN network

Detailed Description

The implementation of the technical scheme is described in detail below.

This work first constructs a high-order brain network by sparse representation and uses it as the edge information of a brain network graph. The original FMRI time-series signal vector of each brain region is used as the node information, which defines a complete graph structure. Unlike traditional brain network classification and diagnosis methods and frameworks, this work constructs a high-order functional brain network graph from brain FMRI data and, at the same time, constructs a structural brain network graph with DTI structural data as the edges between nodes and FMRI signals as the node feature vectors, which is used to extract the attention weights of the brain regions.

(I) Construction of a high-order functional brain network based on FMRI

Let $X = [x_1, x_2, \ldots, x_{90}] \in \mathbb{R}^{240 \times 90}$ denote the node feature vectors of the brain network graph, i.e. the FMRI data of a single sample in the sample set. Here 90 is the number of brain regions in each sample, $x_i$ is the FMRI feature vector of the i-th brain region of the sample, and 240 is the length of each brain region feature vector.

The FMRI signal vectors of the brain regions are used to construct a high-order functional brain network; a fully connected brain network built among all brain regions by sparse representation better captures the high-order relations and latent information of brain region function.

In the expression, $A_i$ denotes the dictionary corresponding to the i-th brain region. It consists of the features of the other N-1 brain regions, and the elements of its i-th column, which corresponds to the i-th brain region itself, are zero. $E_i$ is a 90 × 1 vector, the indicator vector of the dictionary representation of the i-th brain region; here it gives the weights of the edges between brain region node i and the other nodes. Computing the sparse-representation indicator vectors of all brain regions yields the indicator vector matrix:

$$E = \{E_1, E_2, \ldots, E_{90}\}$$

$E$ is the adjacency matrix of the graph and represents the edges of the high-order functional brain network graph. Because $E$ is constructed by sparse representation, many of its elements are zero, which means there is no edge between the corresponding nodes. The construction process is shown in FIG. 1. We call this graph the high-order functional brain network graph.
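The construction of E can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the patented procedure: it assumes an l1-regularized least-squares objective, min over E_i of 0.5 * ||x_i - A_i E_i||^2 + lam * ||E_i||_1, solved with the ISTA iteration. The objective form, the weight `lam`, the iteration count, and all function names are illustrative and do not come from the source.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, A, lam=0.1, n_iter=200):
    """Approximately solve min_e 0.5*||x - A e||^2 + lam*||e||_1 with ISTA."""
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    L = np.linalg.norm(A, 2) ** 2 + 1e-12
    e = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ e - x)
        e = soft_threshold(e - grad / L, lam / L)
    return e

def build_functional_network(X, lam=0.1, n_iter=200):
    """X: (T, N) FMRI signals, one column per brain region. Returns E: (N, N)."""
    n_regions = X.shape[1]
    E = np.zeros((n_regions, n_regions))
    for i in range(n_regions):
        A_i = X.copy()
        A_i[:, i] = 0.0  # dictionary for region i: its own column is zeroed
        E[:, i] = sparse_code(X[:, i], A_i, lam, n_iter)
    return E

# Synthetic stand-in for one subject (240 time points, 90 regions).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((240, 90))
E_demo = build_functional_network(X_demo, lam=5.0, n_iter=50)
```

Because the i-th column of each dictionary is zero, the i-th coefficient of E_i never leaves zero, so the resulting graph has no self-loops. A larger `lam` yields a sparser edge matrix.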

(II) constructing a structural network for extracting brain region attention information based on DTI

Meanwhile, to extract the attention weights of the brain network, a brain network graph is constructed with DTI structural data as the edges between nodes and FMRI signals as the node feature vectors. As shown in FIG. 2, the edge matrix of this graph is built from the DTI structural information, and the node features are $X = [x_1, x_2, \ldots, x_{90}]$, the same as in the functional brain network graph. We call this graph the structural brain network graph. Its purpose is to extract the attention weights of the brain region nodes.

(III) calculating the first-order, second-order and third-order attention scores of the brain areas through node aggregation

This section describes how joint attention is extracted over the DTI and FMRI modalities and used in the pooling of the GCN. The first two steps defined a DTI structural brain network graph with FMRI signal vectors as node features; the node features of the graph are denoted by $X$ and its edges by the DTI edge matrix. The joint attention score $Z$ of the nodes is obtained by aggregating over nodes.

Here $\langle x_i, x_j \rangle$ is the inner product of node i and node j; it measures the correlation between the features of the two nodes and reflects the functional connection between the brain regions joined by an edge of the brain network graph. In other words, the strength of functional activity is measured on each edge of the structural network represented by DTI, and from this an attention score is evaluated for each brain region node. A higher attention score for a node on the structural network indicates that the brain region carries out more functional activity with other brain regions through the edges of the DTI structural network. The attention score of the l-th aggregation layer is denoted by $Z^l$, and the (l+1)-th layer is calculated as:

This way of calculating attention requires no additional parameters, so it is simple to compute. The attention scores extracted by the above equation are then applied to the pooling layer of the GCN. For node screening we adopt a TopK mechanism.
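The parameter-free aggregation described in this step can be sketched as follows. This is a minimal sketch under explicit assumptions: the first-order score of node i is taken to be the sum of the inner products <x_i, x_j> over its DTI neighbors j, and each higher order averages the previous scores of the neighbors over the DTI edges. The name `S` for the DTI edge matrix, the binary edge mask, and the function name are hypothetical.

```python
import numpy as np

def joint_attention(X, S, order=3):
    """Parameter-free attention scores for each brain region.

    X: (T, N) FMRI node features, one column per region.
    S: (N, N) DTI structural edge matrix (assumed symmetric, non-negative).
    Returns a list [Z1, ..., Z_order], each of shape (N,).
    """
    mask = (S > 0).astype(float)
    np.fill_diagonal(mask, 0.0)          # ignore self-connections
    gram = X.T @ X                       # gram[i, j] = <x_i, x_j>
    # First order (assumption): functional activity summed over DTI edges.
    z = (mask * gram).sum(axis=1)
    scores = [z]
    # Higher orders (assumption): average the neighbors' previous scores.
    deg = np.maximum(mask.sum(axis=1), 1.0)
    for _ in range(order - 1):
        z = (mask @ z) / deg
        scores.append(z)
    return scores

# Toy graph with three regions and DTI edges 0-1 and 1-2.
X_demo = np.array([[1.0, 1.0, 0.0],
                   [0.0, 1.0, 2.0]])     # columns are x_0, x_1, x_2
S_demo = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0]])
Z1, Z2, Z3 = joint_attention(X_demo, S_demo)
# Z1 -> [1.0, 3.0, 2.0]: node 1 has two edges with non-zero functional correlation.
```

No weights are trained here; the scores depend only on the FMRI features and the DTI connectivity, which matches the parameter-free property stated above.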

(IV) Executing graph convolution and pooling operations and classifying the output features with a multilayer perceptron to obtain the final identification result

The TopK pooling mechanism follows the idea of max pooling in CNNs and screens out the most valuable information. In this work, the nodes are ranked by the joint attention scores learned over the DTI and FMRI modalities; unimportant nodes are screened out and nodes with high attention are retained.
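TopK selection by attention score can be sketched as below. The keep ratio `ratio` and the function name `topk_pool` are illustrative assumptions; the source only states that nodes are ranked by joint attention and that the low-scoring ones are discarded.

```python
import numpy as np

def topk_pool(H, A, z, ratio=0.5):
    """Keep the ceil(ratio * N) nodes with the highest attention scores.

    H: (N, F) node features, A: (N, N) adjacency matrix, z: (N,) scores.
    Returns the pooled features, the induced sub-adjacency, and kept indices.
    """
    n = H.shape[0]
    k = max(1, int(np.ceil(ratio * n)))
    # Sort by descending score (stable, so ties keep the lower index first).
    idx = np.argsort(-z, kind="stable")[:k]
    idx = np.sort(idx)                   # preserve the original node order
    return H[idx], A[np.ix_(idx, idx)], idx

H_demo = np.arange(8.0).reshape(4, 2)    # 4 nodes, 2 features each
A_demo = np.ones((4, 4)) - np.eye(4)     # fully connected, no self-loops
z_demo = np.array([0.1, 0.9, 0.4, 0.7])
H_pool, A_pool, kept = topk_pool(H_demo, A_demo, z_demo, ratio=0.5)
# kept -> [1, 3]: the two nodes with the highest attention scores.
```

The sub-adjacency is the subgraph induced by the kept nodes, so edges to discarded nodes disappear together with those nodes.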

After the subgraph division and the corresponding adjacency matrix have been determined, the information of the non-grid subgraph needs to be integrated and extracted. The previous step defined the aggregation method for node attention; here we first introduce the node aggregation of the convolutional layer.

$$h = \mathrm{GNN}(X, E)$$

Specifically, it can be written as follows:

where $W$ is the parameter to be trained and the remaining matrix is the degree matrix of the adjacency matrix. The (l+1)-th layer is defined as:
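A commonly used graph convolution layer that is consistent with this description (a trainable weight matrix and degree normalization) can be sketched as follows. The added self-loops, the symmetric normalization D^-1/2 (A + I) D^-1/2, and the ReLU activation are assumptions borrowed from the standard GCN formulation, not details confirmed by the source; the function name is hypothetical. The sketch also assumes a non-negative adjacency matrix, so sparse coefficients with negative entries would need, for example, their absolute values taken first.

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    H: (N, F_in) node features, A: (N, N) non-negative adjacency,
    W: (F_in, F_out) weights (trainable in practice, fixed here).
    """
    A_hat = A + np.eye(A.shape[0])       # add self-loops (assumption)
    deg = A_hat.sum(axis=1)              # diagonal of the degree matrix
    d_inv_sqrt = 1.0 / np.sqrt(deg)      # deg >= 1 because of the self-loops
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Two connected nodes with one-hot features.
A_demo = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
H_demo = np.eye(2)
W_demo = np.array([[2.0],
                   [4.0]])
H_next = gcn_layer(H_demo, A_demo, W_demo)
# H_next -> [[3.0], [3.0]]: each node averages itself and its neighbor.
```

Stacking the layer means feeding `H_next` back in as `H` with a new weight matrix, which is how the output of the l-th layer becomes the input of the (l+1)-th layer.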

Because the pooling layers use the TopK method for downsampling in the GCN, discarding nodes layer by layer improves the efficiency of fusing each node's high-order neighbors, but it also breaks the structural integrity of the brain network graph, so that the later layers lose the ability to fuse all nodes. A readout layer is therefore added after each pooling layer of the GCN; it performs one global aggregation over the graph nodes of the current layer, and the result is passed to the multilayer perceptron for final classification. The readout layer is calculated as follows:

The above equation concatenates the results of global average pooling and global max pooling to obtain the readout; || denotes the concatenation operation. Max pooling is commonly used in neural networks to extract features and reduce the influence of irrelevant information, and average pooling complements it by preserving background information and mitigating overfitting caused by unevenly distributed feature data. The readouts of all layers are summed to give the representation of the whole graph:
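The readout and the layer-wise sum described here can be sketched as below. The function names are hypothetical, and the sketch assumes every layer produces node features of the same width so that the per-layer readouts can be added.

```python
import numpy as np

def readout(H):
    """Concatenate global average pooling and global max pooling over nodes.

    H: (N, F) node features of the current layer. Returns a (2F,) vector.
    """
    return np.concatenate([H.mean(axis=0), H.max(axis=0)])

def graph_representation(layer_outputs):
    """Sum the readouts of all layers (all layers must share the width F)."""
    return np.sum([readout(H) for H in layer_outputs], axis=0)

H1 = np.array([[1.0, 2.0],
               [3.0, 6.0]])              # two nodes left after the first pooling
H2 = np.array([[0.0, 2.0]])              # one node left after the second pooling
g = graph_representation([H1, H2])
# readout(H1) -> [2, 4, 3, 6], readout(H2) -> [0, 2, 0, 2], g -> [2, 6, 3, 8]
```

The readout size depends only on the feature width, not on how many nodes survive pooling, which is what allows the readouts of differently sized layers to be summed before the multilayer perceptron.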

The role of the readout layer is similar to the global pooling commonly used after the convolutional layers of CNN models: both obtain a global representation by aggregating the global input in a single pass. As with global pooling in CNNs, the readout layer may also use common operations such as summation, averaging, and maximum. The resulting GCN network with readout layers is shown in FIG. 4.

In summary, neighbor nodes are aggregated on the DTI structural brain network graph to obtain joint attention over the DTI and FMRI modalities; unimportant brain region nodes are then pooled away by TopK according to their attention values; a readout layer after each pooling layer aggregates the information of all remaining nodes so that the global information of each layer is learned; and finally the readout results of all layers are summed and passed to a multilayer perceptron for classification.

Claims

1. A joint attention graph convolution method for automatic classification of brain medical images, characterized in that the method comprises the following steps:

(I) Constructing a high-order functional brain network based on FMRI: the high-order functional brain network is constructed by a sparse representation method from the correlations among the FMRI signal vectors of the brain regions, and the original vectors are used as the node features of the network nodes.

(III) Calculating first-order, second-order and third-order attention scores of the brain regions through node aggregation, extracting joint attention over the DTI and FMRI modalities, and using it in the pooling of the GCN.

(IV) Executing graph convolution and pooling operations and classifying the output features with a multilayer perceptron to obtain the final identification result. The nodes are ranked by the joint attention scores learned over the DTI and FMRI modalities, unimportant nodes are screened out by a TopK strategy, and nodes with high attention are retained.

2. The joint attention graph convolution method for automatic classification of brain medical images according to claim 1, characterized in that the high-order functional brain network graph of step (I) is constructed by building the brain network with a sparse representation method and then using the original vectors as the node features of the network nodes.

3. The joint attention graph convolution method for automatic classification of brain medical images according to claim 1, characterized in that the structural brain network graph of step (II) is constructed with DTI as the edges of the brain network graph and FMRI vectors as the node features of the network nodes.

4. The joint attention graph convolution method for automatic classification of brain medical images according to claim 1, characterized in that, in the method of step (III) for extracting the attention of brain region nodes, the importance of each brain region is calculated using the node aggregation of graph convolution.

5. The joint attention graph convolution method for automatic classification of brain medical images according to claim 1, characterized in that, in the learning of brain structural and functional information and the feature extraction of step (IV), the functional network and the structural network are fused in a graph convolutional neural network by using both DTI and FMRI modal data, thereby improving the performance of automatic diagnosis.