CN113205055A

CN113205055A - Fungus microscopic image classification method and system based on multi-scale attention mechanism

Info

Publication number: CN113205055A
Application number: CN202110509387.3A
Authority: CN
Inventors: 许鸿雁
Original assignee: Beijing Zhijian Life Technology Co ltd
Current assignee: Chongqing Zhijian Life Technology Co ltd
Priority date: 2021-05-11
Filing date: 2021-05-11
Publication date: 2021-08-03
Anticipated expiration: 2041-05-11
Also published as: CN113205055B

Abstract

The invention provides a fungus microscopic image classification method and system based on a multi-scale attention mechanism. The method comprises: obtaining training samples, wherein the training samples comprise a plurality of fungus microscopic images and each fungus microscopic image has a corresponding fungus category label; constructing a deep learning image classification model comprising an attention module, training the deep learning image classification model with the training samples, and taking the trained model as a fungus image classification model; and inputting the fungus microscopic image to be classified into the fungus image classification model to obtain its fungus category. By adding the attention module to the network, the invention makes the network focus on the region where the fungi are located and ignore the influence of the background region as far as possible, so that different species of the same genus, whose morphological differences are small, are identified more accurately.

Description

Fungus microscopic image classification method and system based on multi-scale attention mechanism

Technical Field

The invention relates to the technical field of image classification in computer vision, in particular to a fungus microscopic image classification method and system based on a multi-scale attention mechanism.

Background

Data show that there are about 2 million species of fungi in nature, of which about 560 are pathogenic to humans. Treating fungal infections costs about 2.6 billion dollars per year. Tens of millions of people worldwide contract medical fungal infections each year, and deep fungal infections cause about 1.5 million deaths each year, seriously threatening human health.

How to identify medical fungal infections, especially fatal ones, rapidly and accurately is a difficult problem that world medicine has yet to overcome. In clinical practice, the category of a pathogenic fungus is determined mainly by biochemical identification, and the whole process can take 4-10 days. Existing deep-learning-based fungus classification and identification techniques can classify only a limited number of fungi, require very large training data sets, and have low classification accuracy.

For the identification and classification of the fungal microscopic image, the main technical difficulties faced at present are:

yeasts and filamentous fungi differ in morphological size: yeasts are usually small and occupy only a small region of the image, whereas filamentous fungi are relatively large;

morphological differences between different species of the same genus are small; for example, Candida glabrata and Candida guilliermondii are difficult to tell apart by eye from the image alone.

Disclosure of Invention

In view of the above, the present invention designs a Spatial Atrous module inside the SE module. Specifically, the module concatenates the output feature maps of dilated (atrous) convolutions with three dilation rates, 3, 6 and 9, and inputs the concatenated feature maps into the next convolutional layer. For the dilated convolution with a dilation rate of 3, the output feature map is large and the receptive field is small, so the feature map contains rich local detail information; for the dilated convolution with a dilation rate of 9, the output feature map is small and the receptive field is large, so the feature map contains rich global information. By fusing the three feature maps of different sizes output by dilated convolutions with different dilation rates, the invention obtains both rich global information and rich local information, that is, rich multi-scale information. Such a network is robust to objects to be examined (fungi) of different sizes. In addition, the SE module added to the network makes the network focus on the region where the fungi are located and ignore the influence of the background region as far as possible, so that different species of the same genus, whose morphological differences are small, are identified more accurately.
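The trade-off between output size and receptive field for the rates 3, 6 and 9 can be checked with simple arithmetic. The following is a minimal sketch, assuming a 3x3 kernel, stride 1 and no padding; none of these values is stated in this document, and the function names and the 32x32 input size are illustrative only.

```python
def effective_kernel(kernel_size: int, rate: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def output_size(input_size: int, kernel_size: int, rate: int) -> int:
    """Output side length, assuming stride 1 and no padding."""
    return input_size - effective_kernel(kernel_size, rate) + 1


if __name__ == "__main__":
    # Hypothetical 32x32 input feature map and 3x3 kernel (assumptions).
    for rate in (3, 6, 9):
        print(rate, effective_kernel(3, rate), output_size(32, 3, rate))
```

Under these assumptions the span of the kernel grows with the rate while the unpadded output shrinks, which matches the description of a larger receptive field and a smaller output feature map at rate 9 than at rate 3.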

Specifically, the invention provides a fungus microscopic image classification method based on a multi-scale attention mechanism, which comprises the following steps:

step 1, obtaining a training sample, wherein the training sample comprises a plurality of fungus microscopic images, and each fungus microscopic image has a corresponding fungus category label;

step 2, constructing a deep learning image classification model comprising an attention module, training the deep learning image classification model by using the training sample, and taking the deep learning image classification model after training as a fungus image classification model;

step 3, inputting the fungus microscopic image to be classified into the fungus image classification model to obtain its fungus category.

In the fungus microscopic image classification method based on the multi-scale attention mechanism, the deep learning image classification model comprises convolutional layers connected in series, and the output of the series-connected convolutional layers is connected to the input of the attention module.

In the fungus microscopic image classification method based on the multi-scale attention mechanism, the deep learning image classification model comprises an average pooling layer, a fully connected layer and an activation layer.

In the fungus microscopic image classification method based on the multi-scale attention mechanism, the fungus microscopic image input into the deep learning image classification model passes through a plurality of convolutional layers to extract image features; the image features are input into an attention module with dilated convolution, so that the attention of the deep learning network is focused on the region where the fungus is located; and the fungus category is obtained after one further convolutional layer.

In the fungus microscopic image classification method based on the multi-scale attention mechanism, the attention module concatenates the output feature maps of a plurality of dilated convolutions with different dilation rates to obtain multi-scale fusion features, and the multi-scale fusion features are then input to the last convolutional layer in the deep learning image classification model.

The invention also provides a fungus microscopic image classification system based on a multi-scale attention mechanism, which comprises:

module 1, configured to obtain training samples, wherein the training samples comprise a plurality of fungus microscopic images and each fungus microscopic image has a corresponding fungus category label;

module 2, configured to construct a deep learning image classification model comprising an attention module, train the deep learning image classification model with the training samples, and take the trained model as a fungus image classification model;

module 3, configured to input the fungus microscopic image to be classified into the fungus image classification model to obtain its fungus category.

In the fungus microscopic image classification system based on the multi-scale attention mechanism, the deep learning image classification model comprises convolutional layers connected in series, and the output of the series-connected convolutional layers is connected to the input of the attention module.

In the fungus microscopic image classification system based on the multi-scale attention mechanism, the deep learning image classification model comprises an average pooling layer, a fully connected layer and an activation layer.

In the fungus microscopic image classification system based on the multi-scale attention mechanism, the fungus microscopic image input into the deep learning image classification model passes through a plurality of convolutional layers to extract image features; the image features are input into an attention module with dilated convolution, so that the attention of the deep learning network is focused on the region where the fungus is located; and the fungus category is obtained after one further convolutional layer.

In the fungus microscopic image classification system based on the multi-scale attention mechanism, the attention module concatenates the output feature maps of a plurality of dilated convolutions with different dilation rates to obtain multi-scale fusion features, and the multi-scale fusion features are then input to the last convolutional layer in the deep learning image classification model.

According to the scheme, the invention has the advantages that:

1. The method is not limited to simply distinguishing yeasts from filamentous fungi; it can distinguish 11 classes of yeasts and filamentous fungi, including different species of the same genus: Candida glabrata, Candida lipolytica, Candida parapsilosis, Candida guilliermondii, Candida krusei, Staphylococcus, Candida tropicalis, Cryptococcus neoformans, Aspergillus niger, Aspergillus fumigatus and Aspergillus versicolor. The pathogenic species is therefore more likely to be identified clinically, so that antifungal treatment can start sooner, the patient's recovery time is shortened, and the patient's suffering is relieved.

2. The designed image classification algorithm is a general method that can be applied to a variety of different tasks and has broad application value.

Drawings

FIG. 1 is an exemplary diagram of sub-categories in a data set;

FIG. 2 is a diagram of a deep learning model architecture;

FIG. 3 is a diagram of the FCL module structure for fungus classification;

FIG. 4 is an expanded view of the scaled SE module of FIG. 3.

Detailed Description

In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.

The technical scheme comprises the following three main technical processes: data set production, model training and optimization, and classification result output.

1. Data set production

In this embodiment, microscopic images of fungi taken under a 100X microscope after 48 h of primary culture are cropped to expand the data set. The data set contains 11 classes of yeasts and filamentous fungi in total, including different species of the same genus; example images are shown in FIG. 1.
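Expanding a data set by cropping can be sketched as follows. The document does not state the crop size, the stride, or whether crops overlap, so the sliding-window scheme and every parameter here are assumptions for illustration; each crop would simply inherit the category label of its source image.

```python
def crop_boxes(width, height, crop, stride):
    """Return (left, top, right, bottom) boxes for sliding-window crops."""
    boxes = []
    for top in range(0, height - crop + 1, stride):
        for left in range(0, width - crop + 1, stride):
            boxes.append((left, top, left + crop, top + crop))
    return boxes


def crop_image(image, box):
    """Cut one box out of an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # Hypothetical 6x4 "image" whose pixel value encodes row and column.
    image = [[r * 10 + c for c in range(6)] for r in range(4)]
    for box in crop_boxes(width=6, height=4, crop=2, stride=2):
        print(box, crop_image(image, box))
```

A stride smaller than the crop size would produce overlapping crops and therefore more samples per source image.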

2. Model training and optimization

After the data set is produced, it is input into the deep learning classification network, and the model is trained and optimized.

The architecture of the deep learning model used is shown in FIG. 2. On the basis of the lightweight MobileNetV2 network, a Squeeze-and-Excitation (SE) module is introduced. Using an attention mechanism over image channels, it explicitly models the interdependence among feature channels and thereby recalibrates the features. Fungi usually occupy only a small part of a microscopic image of a fungal sample, and most of the image is background, so the SE module is used to focus the network on the target, namely the region where the fungi are located. The SE module comprises a Squeeze operation and an Excitation operation: the former contains a global pooling layer that compresses the global spatial information into a channel descriptor to obtain a global receptive field; the latter contains two fully connected layers to fully capture the channel correlation.
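The Squeeze and Excitation operations described above can be sketched in plain Python. The document only states that Excitation contains two fully connected layers; the ReLU after the first layer and the sigmoid gate after the second follow the commonly published SE design and are assumptions here, as are all function names and the nested-list data layout.

```python
import math


def squeeze(feature_maps):
    """Global average pooling: one scalar descriptor per channel."""
    return [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]


def dense(vector, weights, biases):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def excitation(descriptor, w1, b1, w2, b2):
    """Two fully connected layers: ReLU after the first, sigmoid after the second."""
    hidden = [max(0.0, h) for h in dense(descriptor, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]


def se_block(feature_maps, w1, b1, w2, b2):
    """Rescale every channel by its gate value in (0, 1)."""
    gates = excitation(squeeze(feature_maps), w1, b1, w2, b2)
    return [[[g * v for v in row] for row in fm]
            for fm, g in zip(feature_maps, gates)]
```

Here `feature_maps` is a list of channels, each a list of rows. Channels whose gate is close to 0 are suppressed and channels whose gate is close to 1 pass through almost unchanged, which is the feature recalibration referred to in the text.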

In order to further enlarge the receptive field in the network and extract multi-scale context information without significantly increasing the amount of computation, the method introduces dilated (atrous) convolution into the SE module. In a convolutional neural network the receptive field is an important factor: it is the size of the region of the original image onto which a pixel of a layer's output feature map is mapped, and it reflects how much feature information the network extracts. The common way to enlarge the receptive field is to reduce the image size by pooling and then restore it by upsampling, but this loses image detail and thus lowers the accuracy of the network. Dilated convolution is therefore a better alternative. It adds a new parameter, the dilation rate, to the convolutional layer, so that on the same feature map a dilated convolution obtains a larger receptive field and denser data. The larger receptive field improves the network's recognition accuracy for small objects.
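How the rate parameter widens the receptive field without adding weights can be seen in one dimension. The following is a minimal sketch with illustrative names; like convolution layers in deep learning frameworks it computes a cross-correlation, and it assumes stride 1 and no padding.

```python
def dilated_conv1d(signal, kernel, rate):
    """Unpadded 1D dilated convolution with stride 1.

    Kernel taps are spaced `rate` samples apart, so a kernel with k weights
    spans (k - 1) * rate + 1 input samples without adding parameters.
    """
    span = (len(kernel) - 1) * rate + 1
    return [
        sum(w * signal[start + i * rate] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


if __name__ == "__main__":
    signal = list(range(10))
    kernel = [1, 1, 1]  # three weights in both cases
    print(dilated_conv1d(signal, kernel, rate=1))  # each output sees 3 samples
    print(dilated_conv1d(signal, kernel, rate=3))  # each output sees a span of 7
```

With the same three weights, rate 1 covers three adjacent samples per output while rate 3 covers a span of seven samples, at the same cost per output value.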

The lightweight MobileNetV2 network adopted by the invention comprises:

1> an average pooling layer, for reducing the number of network parameters and accelerating network computation. The number of convolution kernel parameters is reduced from (number of input channels x number of output channels x filter size) to (number of output channels).

2> a fully connected (FC) layer, for analyzing the network features and distinguishing the importance of each layer's output.

3> an activation layer (ReLU), for introducing a nonlinear factor into the network, improving the expressive capability of the neural network and solving problems that a linear model cannot solve.

The structure of the Fungal Classification (FCL) module designed by the present invention is shown in FIG. 3. Batch Normalization (BN), the average pooling layer, the fully connected layer and the activation layer are omitted from the figure for simplicity. After the image features of the input fungus image are extracted by three convolutional layers, they are input into an SE module with dilated convolution, so that the attention of the network is focused on the region where the fungus is located, and the result is output after one further convolutional layer.

The SE module of FIG. 3 is shown in detail in FIG. 4. It takes the output of the Conv3 convolutional layer in FIG. 3 as input, which first passes through the global pooling layer of the Squeeze operation to compress the global spatial information into a channel descriptor, and then passes through the designed Spatial Atrous module, shown in detail in the right-hand box of FIG. 4. The module concatenates the output feature maps of dilated convolutions with three dilation rates, 3, 6 and 9. For the dilated convolution with a dilation rate of 3, the output feature map is large and the receptive field is small, so it contains rich local detail information; for the dilated convolution with a dilation rate of 9, the output feature map is small and the receptive field is large, so it contains rich global information. By fusing the feature maps of three sizes output by dilated convolutions with different dilation rates, both rich global information and rich local information are obtained, that is, rich multi-scale information is fused. This makes the network more robust to objects to be examined (fungi) of different sizes. The feature map output by the Spatial Atrous module then undergoes the Excitation operation, which includes two fully connected layers to fully capture the channel correlation. Finally, the output of the entire scaled SE module is input to the Conv4 convolutional layer in FIG. 3, and the result is output after class prediction.
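The fusion of three branches with rates 3, 6 and 9 can be sketched as follows. The document states that the branch outputs differ in size and does not say how they are aligned before being connected; this sketch therefore assumes zero padding so that every branch keeps the input size, and it stacks the branches as channels. The kernels, the single-channel input and the function names are likewise assumptions.

```python
def dilated_conv2d_same(image, kernel, rate):
    """Dilated 2D convolution, stride 1, zero padding (output size = input size).

    Assumes a square kernel with an odd side length.
    """
    height, width = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + (i - half) * rate
                    xx = x + (j - half) * rate
                    # Taps that fall outside the image read zero padding.
                    if 0 <= yy < height and 0 <= xx < width:
                        total += kernel[i][j] * image[yy][xx]
            out[y][x] = total
    return out


def spatial_atrous(image, kernels, rates=(3, 6, 9)):
    """Run one dilated branch per rate and stack the results as channels."""
    return [dilated_conv2d_same(image, k, r) for k, r in zip(kernels, rates)]


if __name__ == "__main__":
    image = [[1.0] * 20 for _ in range(20)]      # hypothetical 20x20 feature map
    kernel = [[1.0] * 3 for _ in range(3)]       # hypothetical 3x3 kernel
    fused = spatial_atrous(image, [kernel] * 3)
    print(len(fused), len(fused[0]), len(fused[0][0]))
```

Each output channel looks at the same position through a differently spaced 3x3 grid, so the stacked result carries local context from the rate-3 branch and wider context from the rate-9 branch.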

After the model is trained on the training set and its results are output on the validation set, the network parameters are fine-tuned and the model is iteratively optimized until the classification result meets the standard. Finally, the optimized and trained model is used to predict the classes of the test set, yielding the final classification result.
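The train, validate and iterate procedure can be sketched as a simple loop. The document does not define the standard that the classification result must meet, so a validation accuracy threshold is assumed here; `train_one_epoch` and `evaluate` are hypothetical callables supplied by the caller, not functions of any particular library.

```python
def fit(train_one_epoch, evaluate, target_accuracy, max_epochs):
    """Train until validation accuracy meets the target or epochs run out.

    train_one_epoch: hypothetical callable that updates the model once.
    evaluate: hypothetical callable returning validation accuracy in [0, 1].
    Returns the validation accuracy recorded after each epoch.
    """
    history = []
    for _ in range(max_epochs):
        train_one_epoch()
        accuracy = evaluate()
        history.append(accuracy)
        if accuracy >= target_accuracy:
            break
    return history


if __name__ == "__main__":
    # Stand-in accuracies in place of a real model, for illustration only.
    scores = iter([0.50, 0.70, 0.90, 0.95])
    print(fit(lambda: None, lambda: next(scores), target_accuracy=0.9, max_epochs=10))
```

The test set is kept out of this loop and is used only once, after the loop ends, to produce the final classification result.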

3. Output of classification result

After the test set data is classified by the model, the classification result is output.

The following is a system embodiment corresponding to the above method embodiment, and it can be implemented in cooperation with the above embodiment. The technical details mentioned in the above embodiment remain valid in this embodiment and are not repeated here. Correspondingly, the technical details mentioned in this embodiment can also be applied to the above embodiment.

Claims

1. A fungus microscopic image classification method based on a multi-scale attention mechanism is characterized by comprising the following steps:

2. The method of claim 1, wherein the deep learning image classification model comprises convolutional layers connected in series, and an output of the convolutional layers connected in series is connected to an input of the attention module.

3. The fungal microscopic image classification method based on a multi-scale attention mechanism according to claim 1, characterized in that the deep learning image classification model comprises an average pooling layer, a fully connected layer and an activation layer.

4. The fungal microscopic image classification method based on a multi-scale attention mechanism according to claim 3, wherein the fungal microscopic image input into the deep learning image classification model passes through a plurality of convolutional layers to extract image features, the image features are input into an attention module with dilated convolution so that the attention of the deep learning network is focused on the region where the fungi are located, and the fungal category is obtained after one further convolutional layer.

5. The fungal microscopic image classification method based on a multi-scale attention mechanism according to any one of claims 1 to 4, wherein the attention module concatenates the output feature maps of a plurality of dilated convolutions with different dilation rates to obtain multi-scale fusion features, and inputs the multi-scale fusion features to the last convolutional layer in the deep learning image classification model.

6. A fungal microscopic image classification system based on a multi-scale attention mechanism is characterized by comprising:

7. The system of claim 1, wherein the deep learning image classification model comprises convolutional layers connected in series, and wherein outputs of the convolutional layers in series are connected to inputs of the attention module.

8. The fungal microscopic image classification system based on a multi-scale attention mechanism according to claim 1, wherein the deep learning image classification model comprises an average pooling layer, a fully connected layer and an activation layer.

9. The fungal microscopic image classification system based on a multi-scale attention mechanism according to claim 3, wherein the fungal microscopic image input into the deep learning image classification model passes through a plurality of convolutional layers to extract image features, the image features are input into an attention module with dilated convolution so that the attention of the deep learning network is focused on the region where the fungi are located, and the fungal category is obtained after one further convolutional layer.

10. The system according to any one of claims 6 to 9, wherein the attention module concatenates the output feature maps of a plurality of dilated convolutions with different dilation rates to obtain multi-scale fusion features, and inputs the multi-scale fusion features to the last convolutional layer in the deep learning image classification model.