CN113793326A - Disease identification method and device based on image - Google Patents

Disease identification method and device based on image

Info

Publication number: CN113793326A
Application number: CN202111109909.7A
Authority: CN (China)
Prior art keywords: modal, feature maps, modality, disease, feature
Priority date / filing date: 2021-09-18
Publication date: 2021-12-14
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 赵建春, 周阳, 林海澜, 丁大勇
Current assignee: Beijing Vistel Technology Co., Ltd.
Original assignee: Beijing Vistel Technology Co., Ltd.
Application filed by Beijing Vistel Technology Co., Ltd.

Classifications

    • G06T 7/0012 — Image analysis; inspection of images, e.g. flaw detection; biomedical image inspection
    • G06F 18/2415 — Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
    • G06F 18/253 — Pattern recognition; fusion techniques applied to extracted features
    • G06N 3/045 — Neural network architectures; combinations of networks
    • G06T 2207/10101 — Image acquisition modality; optical tomography; optical coherence tomography [OCT]
    • G06T 2207/30041 — Subject of image; biomedical image processing; eye, retina, ophthalmic

Abstract

The embodiments of the present application disclose an image-based disease identification method and apparatus. In the method, N first-modality feature maps and M second-modality feature maps undergo H rounds of feature fusion to obtain H fused modality feature maps to be identified. Each of the H fused maps is then classified against preset disease categories, yielding a probability value for each category that represents the likelihood that the disease of that category is present. By fusing feature images from at least two modalities and detecting the disease category automatically from the fused images, the method avoids manual detection, improves the accuracy and efficiency of image-based disease category detection, and avoids manually selecting images and manually annotating the lesions and disease categories of images.

Description

Disease identification method and device based on image
Technical Field
The present application relates to the field of image processing technology, and in particular to an image-based method and apparatus for disease identification.
Background
At present, clinical classification of fundus disease types is based on fundus color photographs (CFP) and OCT images, and relies mainly on manual annotation: the captured images are labeled by hand, and the labels are then examined to determine which disease types the fundus color photographs and OCT images contain.
Existing multi-modal automatic detection of fundus diseases processes fundus color photographs together with OCT images, but takes only a single OCT image as input, and that image must be selected manually. Because OCT is usually acquired as a sequence, using a single OCT image loses much of the information in the OCT modality, and the information in the chosen OCT image is still annotated manually. As a result, the available data cannot be fully exploited and detection accuracy is low.
Disclosure of Invention
In view of the above technical problems, the embodiments of the present application provide an image-based disease identification method and apparatus, so as to solve the problem that the above approach cannot fully utilize the available data, resulting in low detection accuracy.
A first aspect of an embodiment of the present application provides an image-based disease identification method, including:
acquiring N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
performing H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused modality feature maps to be identified;
classifying each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each of the preset disease categories, the probability value representing the likelihood that the disease of that category is present.
In one embodiment, performing feature fusion on the N first-modality feature maps and the M second-modality feature maps, for any fused modality feature map to be identified, includes:
converting the N first-modality feature maps into N third-modality feature maps and converting the M second-modality feature maps into M fourth-modality feature maps, wherein the N third-modality feature maps and the M fourth-modality feature maps are comparable in modality;
configuring a first initial weight for each of the N third-modality feature maps and a second initial weight for each of the M fourth-modality feature maps, wherein each first initial weight represents the importance of the corresponding third-modality feature map and each second initial weight represents the importance of the corresponding fourth-modality feature map;
and fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified.
In an embodiment, fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified includes:
computing the product of each of the N third-modality feature maps with its corresponding first initial weight, and the product of each of the M fourth-modality feature maps with its corresponding second initial weight;
and adding all of the products to obtain the fused modality feature map to be identified.
In one embodiment, the method comprises:
obtaining a first importance coefficient and a second importance coefficient for each of the N first-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding first-modality feature map and the second importance coefficient represents the region importance of the corresponding first-modality feature map;
and generating a first disease prediction map from the first importance coefficient and the second importance coefficient of each first-modality feature map, wherein the first disease prediction map represents the disease expression in the first modality.
In one embodiment, the method further comprises:
obtaining a first importance coefficient and a second importance coefficient for each of the M second-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding second-modality feature map and the second importance coefficient represents the region importance of the corresponding second-modality feature map;
and generating a second disease prediction map from the first importance coefficient and the second importance coefficient of each second-modality feature map, wherein the second disease prediction map represents the disease expression in the second modality.
In one embodiment, for any first-modality feature map, obtaining the first importance coefficient of that feature map comprises:
averaging the H first initial weights corresponding to the first-modality feature map to obtain its first importance coefficient, the H first initial weights being produced during the H rounds of feature fusion of the N first-modality feature maps and the M second-modality feature maps.
In one embodiment, the N first-modality sub-images are obtained from the first-modality image as follows:
cropping the first-modality image to obtain N1 first-modality sub-images, and flipping each of the N1 sub-images to obtain N2 further first-modality sub-images, where the sum of N1 and N2 is N.
In one embodiment, acquiring the N first-modality feature maps and the M second-modality feature maps includes:
performing feature extraction on the N first-modality sub-images through a first feature extraction model to obtain the N first-modality feature maps and the second importance coefficient of each first-modality feature map;
and performing feature extraction on the M second-modality sub-images through a second feature extraction model to obtain the M second-modality feature maps and the second importance coefficient of each second-modality feature map.
In an embodiment, classifying each of the H fused modality feature maps to be identified against the preset disease categories to obtain a probability value for each disease category includes:
if a probability value is greater than 0.5, determining the disease category corresponding to that probability value as a target category.
A second aspect of an embodiment of the present application provides an image-based disease identification apparatus, including:
an image acquisition module, configured to acquire N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
an image fusion module, configured to perform H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused maps;
and an image classification module, configured to classify each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each disease category, the probability value representing the likelihood that the disease of that category is present.
In the embodiments of the present application, H rounds of feature fusion are performed on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified; each fused map contains both first-modality and second-modality features, and the same modality feature carries different weights in different fused maps. At least two modality feature images are thus fused, and the disease category is detected automatically from the fused images, avoiding manual detection. The H fused maps are then classified against preset disease categories to obtain a probability value for each category, representing the likelihood that the corresponding disease is present. This improves the accuracy and efficiency of image-based disease category detection, and avoids manually selecting images and manually annotating the lesions and disease categories of images.
Drawings
The features and advantages of the present application will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the present application in any way, and in which:
FIG. 1 is a schematic flow chart of an image-based disease identification method provided by the present application;
FIG. 2 is a schematic diagram of the fusion flow of two specific modality feature maps in the image-based disease identification method provided by the present application.
Detailed Description
In the following description, numerous specific details are set forth by way of example in order to provide a thorough understanding of the relevant disclosure. It will be apparent, however, to one skilled in the art that the present application may be practiced without these specific details. It should be understood that the terms "system," "apparatus," "unit," and/or "module" are used herein to distinguish components, elements, parts, or assemblies at different levels; these terms may be replaced by other expressions that achieve the same purpose.
It will be understood that when a device, unit, or module is referred to as being "on," "connected to," or "coupled to" another device, unit, or module, it can be directly on, connected, or coupled to, or in communication with, the other device, unit, or module, or intervening devices, units, or modules may be present, unless the context clearly indicates otherwise. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application. As used in the specification and claims of this application, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises" and "comprising" indicate the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, integers, steps, operations, elements, and/or components.
These and other features and characteristics of the present application, as well as the operation and function of the related structural elements and the combination of parts and economies of manufacture, will be better understood from the following description taken in conjunction with the accompanying drawings, which form a part of this specification. It is to be expressly understood, however, that the drawings are for illustration and description only and are not intended to define the limits of the application. The figures are not drawn to scale.
The application describes an image-based disease identification method, as shown in fig. 1, comprising:
step S10 obtains N first modality feature maps and M second modality feature maps, where N and M are both integers greater than or equal to 1, where the N first modality feature maps are feature maps of N pairs of first modality sub-images, respectively, and the N pairs of first modality sub-images are obtained according to the first modality image.
Step S20 performs H times of feature fusion on the N first modal feature maps and the M second modal feature maps, respectively, to obtain H fusion modal feature maps to be identified, where H is an integer greater than or equal to 2, each of the H fusion modal feature maps to be identified includes the first modal feature and the second modal feature, and weights of the same modal feature in the H fusion modal feature maps to be identified are different from each other.
Step S30 is to classify the H fusion modality feature maps to be identified according to preset disease categories, to obtain a probability value corresponding to each disease category in the preset disease categories, where the probability value represents the possibility of each category of disease in the preset category.
With this implementation, at least two modality feature images are fused and the disease category is detected automatically from the fused images. Manual detection is avoided, the accuracy of image-based disease category detection is improved and its efficiency rises accordingly, and manual selection of images and manual annotation of lesions and disease categories are also avoided.
The N first-modality feature maps and the M second-modality feature maps are obtained in step S10, where N and M are integers greater than or equal to 1. In one embodiment, the number of first-modality feature maps equals the number of second-modality feature maps; equal counts make the subsequent fusion of the N first-modality and M second-modality feature maps more uniform, so the fused features are more salient and disease identification more accurate. Of course, in other embodiments, the two counts may differ.
Step S20 performs H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain the H fused modality feature maps to be identified. Fusing the two different modality feature maps makes their information complementary and the fusion between them more thorough, which speeds up and aids the subsequent identification of the disease type.
Step S30 yields a probability value for each of the preset disease categories; the probability value represents the likelihood that the disease of that category is present, and the disease category is finally determined from these values. Manual detection is thereby avoided, the accuracy of image-based disease category detection is improved, detection efficiency rises, and manual selection of images and manual annotation of lesions and disease categories are avoided.
Referring to FIG. 1 and FIG. 2, in some embodiments, performing feature fusion on the N first-modality feature maps and the M second-modality feature maps, for any fused modality feature map to be identified, includes: converting the N first-modality feature maps into N third-modality feature maps and the M second-modality feature maps into M fourth-modality feature maps, where the N third-modality feature maps and the M fourth-modality feature maps are comparable in modality.
Specifically, the application performs image fusion and classification with an MM-MIL module, a new module for multi-modal feature fusion. The MM-MIL module inherits the interpretability of instance-attention-based MIL (multiple instance learning) and fuses instance-level CFP/OCT features with instance attention weights into a single feature vector; that is, the N first-modality feature maps and the M second-modality feature maps undergo feature fusion into vector features. The MM-MIL module has two parts. The first is a cross-modality mapping part: it receives the inputs of the two modalities, i.e., the first-modality feature maps (n × d) and the second-modality feature maps (n × d), and normalizes each through its own fully connected layer to strip away modality-specific information, making the two modalities comparable and enabling their fusion.
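The following is a minimal PyTorch sketch of this cross-modality mapping step. PyTorch itself, the class name, and the layer sizes d_in and d_out are illustrative assumptions rather than the patent's reference implementation; the only structure taken from the text is one fully connected layer per modality.

```python
import torch
import torch.nn as nn

class CrossModalMapping(nn.Module):
    """Maps CFP and OCT instance features into a comparable shared space
    using one fully connected layer per modality, as described above."""

    def __init__(self, d_in: int = 2048, d_out: int = 512):
        super().__init__()
        self.fc_cfp = nn.Linear(d_in, d_out)  # normalizes CFP instance features
        self.fc_oct = nn.Linear(d_in, d_out)  # normalizes OCT instance features

    def forward(self, cfp: torch.Tensor, oct_: torch.Tensor):
        # cfp: (n, d_in) CFP instance features; oct_: (m, d_in) OCT instance features
        return self.fc_cfp(cfp), self.fc_oct(oct_)
```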
First initial weights are then configured for the N third-modality feature maps, and second initial weights for the M fourth-modality feature maps, where each first initial weight represents the importance of the corresponding third-modality feature map and each second initial weight represents the importance of the corresponding fourth-modality feature map. The N third-modality feature maps and the M fourth-modality feature maps are fused according to these weights to obtain the fused modality feature map to be identified.
Specifically, the second part of the MM-MIL module is the instance attention computation part. It concatenates the two feature sets output by the first part into a matrix of size 2n × d, passes this through two linear layers and activation function layers to obtain a weight for each instance (size 2n × 1), and multiplies each instance's weight by the original instance feature to obtain a module output of size 1 × d. When several MM-MIL modules are used, each learns a different feature emphasis, i.e., each fused modality feature map to be identified captures different features, so combining the information from all modules lets the model learn more comprehensively, decide more accurately, and determine the disease category more reliably.
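A sketch of the instance attention part under the same assumptions: the hidden width, the Tanh nonlinearity between the two linear layers, and the softmax normalization of the scores are all illustrative choices; the text only specifies "two linear layers and activation function layers" mapping (2n × d) to (2n × 1).

```python
import torch
import torch.nn as nn

class InstanceAttention(nn.Module):
    """Scores each of the 2n concatenated CFP+OCT instances with two linear
    layers and activations, producing one weight per instance."""

    def __init__(self, d: int = 512, hidden: int = 128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(d, hidden),
            nn.Tanh(),            # activation between the two linear layers
            nn.Linear(hidden, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (2n, d) concatenated instances -> (2n, 1) attention weights
        return torch.softmax(self.score(feats), dim=0)
```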
In an embodiment, fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first and second initial weights to obtain the fused modality feature map to be identified includes: computing the product of each of the N third-modality feature maps with its first initial weight and of each of the M fourth-modality feature maps with its second initial weight, and adding all of the products to obtain the fused modality feature map to be identified.
Specifically, the MM-MIL module computes these weighted products and obtains the fused modality feature map to be identified in two steps. First, the modality-mapped features (2n × d) are multiplied element-wise by the per-instance weights (2n × 1) produced by the attention score computation (the two linear layers and activation function layers), giving features of size 2n × d. Second, the features from the first step are summed along the first dimension (of length 2n) to obtain a 1 × d feature as the module output.
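The two-step fusion as a sketch (the function name is assumed; the shapes follow the text):

```python
import torch

def fuse_instances(feats: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """feats: (2n, d) modality-mapped instance features;
    weights: (2n, 1) instance attention weights."""
    weighted = feats * weights                 # step 1: element-wise, (2n, d)
    return weighted.sum(dim=0, keepdim=True)   # step 2: sum over 2n -> (1, d)
```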
In one embodiment, the method comprises: obtaining a first importance coefficient and a second importance coefficient for each of the N first-modality feature maps, where the first importance coefficient represents the importance of the corresponding first-modality feature map and the second importance coefficient represents its region importance. Predicting from these two importance measures, the importance of the feature map as a whole and the importance of regions within it, makes the predicted disease category more accurate.
A first disease prediction map is generated from the first and second importance coefficients of each first-modality feature map; the first disease prediction map represents the disease expression in the first modality.
It should be noted that because the first disease prediction map represents the disease expression in the first modality, in urgent cases it can be inspected directly to predict whether the eye is diseased, preparing for subsequent treatment. In effect, visualizing the first- or second-modality feature maps provides a preliminary screening judgment before the eye disease type is finally confirmed, saving substantial time in follow-up.
In one embodiment, a first importance coefficient and a second importance coefficient are likewise obtained for each of the M second-modality feature maps, the first representing the importance of the corresponding second-modality feature map and the second representing its region importance. Again, predicting from these two importance measures makes the predicted disease category more accurate.
A second disease prediction map is generated from the first and second importance coefficients of each second-modality feature map; the second disease prediction map represents the disease expression in the second modality, and in urgent cases it can likewise be inspected to predict whether the eye is diseased, making full preparation for subsequent treatment.
In one embodiment, for any first-modality feature map, obtaining its first importance coefficient comprises:
averaging the H first initial weights corresponding to that feature map, the H weights being produced during the H rounds of feature fusion of the N first-modality feature maps and the M second-modality feature maps. Mean pooling is an averaging operation: the fused results output by the MM-MIL modules are averaged, and whether the feature map indicates disease is decided by comparing the average with a preset score.
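Where the first importance coefficient is this average over the H passes, a small sketch might look as follows (the (H, N) weight layout is an assumed convention):

```python
import torch

def first_importance_coefficients(weights: torch.Tensor) -> torch.Tensor:
    """weights: (H, N) - the first initial weight given to each of the N
    first-modality feature maps in each of the H fusion passes.
    Returns the (N,) first importance coefficients (mean over H)."""
    return weights.mean(dim=0)
```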
In one embodiment, the N first-modality sub-images are obtained from the first-modality image by cropping the image to obtain N1 first-modality sub-images and flipping each of them to obtain N2 further sub-images, where the sum of N1 and N2 is N.
Optionally, the N first-modality sub-images are obtained by pseudo-oversampling the first-modality image: the four corners and the center of the image are cropped to yield the N1 first-modality sub-images.
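A sketch of this pseudo-oversampling using torchvision's five-crop (four corners plus center) followed by a flip of each crop. The crop size and the choice of horizontal flipping are assumptions; this particular configuration yields N1 = N2 = 5, whereas the embodiment below uses 12 sub-images per modality.

```python
import torchvision.transforms.functional as TF

def pseudo_oversample(img, crop_size=(448, 448)):
    """Crop the four corners and center, then flip each crop,
    giving N = N1 + N2 sub-images from one modality image."""
    crops = list(TF.five_crop(img, crop_size))  # N1 = 5 sub-images
    flips = [TF.hflip(c) for c in crops]        # N2 = 5 flipped copies
    return crops + flips                        # N = 10
```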
In some embodiments, acquiring the N first-modality feature maps and the M second-modality feature maps includes:
performing feature extraction on the N first-modality sub-images through a first feature extraction model to obtain the N first-modality feature maps and the second importance coefficient of each first-modality feature map;
and performing feature extraction on the M second-modality sub-images through a second feature extraction model to obtain the M second-modality feature maps and the second importance coefficient of each second-modality feature map.
In another embodiment, features are extracted from the N first-modality sub-images and the M second-modality sub-images; the extracted feature maps are arranged into sequences, and the sequences of first- and second-modality feature maps are fed in parallel into two different 2D CNNs (two-dimensional convolutional neural networks). Each extracted feature map has size w × h × d, so the two sequences have sizes N × w × h × d and M × w × h × d (d = 2048, h = w = 8), where N = M = 12 is the number of sub-images per modality. This scheme uses ResNet-50 (a residual neural network); alternatively, other feature extraction models may be used.
Further, spatial global average pooling is applied to the N first-modality feature maps and the M second-modality feature maps so that each feature map yields a feature vector of the required dimension. The first- and second-modality feature maps are then fused by the MM-MIL module to obtain the fused modality feature maps to be identified, so that the information in the N first-modality feature maps and the M second-modality feature maps is fused and made mutually complementary, further increasing the accuracy of disease category judgment.
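A sketch of the per-modality backbone and spatial global average pooling. The 256 × 256 input resolution is an assumption chosen to be consistent with the h = w = 8, d = 2048 sizes quoted above (ResNet-50 downsamples by a factor of 32), and the function names are illustrative.

```python
import torch
import torch.nn as nn
import torchvision.models as models

def make_backbone() -> nn.Module:
    """One ResNet-50 per modality, truncated before its pooling/FC head
    so it emits (n, 2048, 8, 8) feature maps for 256 x 256 inputs."""
    resnet = models.resnet50(weights=None)
    return nn.Sequential(*list(resnet.children())[:-2])

def extract_instance_features(backbone: nn.Module, sub_images: torch.Tensor):
    # sub_images: (n, 3, 256, 256) -> feature maps (n, 2048, 8, 8)
    fmap = backbone(sub_images)
    # spatial global average pooling: one d-dimensional vector per sub-image
    return fmap.mean(dim=(2, 3))  # (n, 2048)

backbone_cfp, backbone_oct = make_backbone(), make_backbone()  # two modalities
```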
In an embodiment, classifying each of the H fused modality feature maps to be identified against the preset disease categories to obtain a probability value for each disease category includes: if a probability value is greater than 0.5, determining the disease category corresponding to that probability value as the target category.
Preferably, before the probability of each modality feature is obtained, each modality feature is converted into a corresponding weight through a linear layer. The outputs are then average-pooled and passed through a sigmoid (S-shaped) function to compute the average probability of each disease category; if a probability exceeds 0.5, the eye is confirmed to be diseased and the disease category corresponding to that modality feature is determined as the target category. In practice, the modality features of the preset disease categories are compared with the modality features whose probabilities have been computed; since features with the same or similar weights tend to be compared with one another, the comparison time is shortened and the efficiency of disease category identification improves.
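A sketch of this classification head under stated assumptions: a single linear layer shared across the H module outputs, mean pooling over H, then a sigmoid with the 0.5 threshold; the number of classes is illustrative.

```python
import torch
import torch.nn as nn

class DiseaseHead(nn.Module):
    """Maps the H fused (1 x d) features to per-category probabilities:
    linear layer -> average pooling over the H scores -> sigmoid."""

    def __init__(self, d: int = 512, num_classes: int = 8):
        super().__init__()
        self.classifier = nn.Linear(d, num_classes)

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        # fused: (H, d), one fused feature per MM-MIL module
        scores = self.classifier(fused)       # (H, num_classes)
        return torch.sigmoid(scores.mean(0))  # (num_classes,) probabilities

# A category is reported as present when its probability exceeds 0.5:
#   detected = (head(fused) > 0.5).nonzero().flatten()
```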
The present application also provides an apparatus for image-based disease identification, the apparatus comprising:
an image acquisition module, configured to acquire N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
an image fusion module, configured to perform H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused maps;
and an image classification module, configured to classify each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each disease category, the probability value representing the likelihood that the disease of that category is present.
Through the image acquisition module, the image fusion module, and the image classification module, images from the two modalities are processed and finally fused, so that disease categories can be detected automatically from at least two modalities. Manual detection is avoided, the accuracy of image-based disease category detection is improved, detection efficiency rises, and manual selection of images and manual annotation of lesions and lesion types are also avoided.
It should be noted that the image acquisition module acquires the N first-modality feature maps and the M second-modality feature maps and feeds them, in sequence order and in parallel, into two different 2D-CNN networks. The extracted feature maps have a fixed size, and the size of each modality's sequence is determined by the number of feature maps in it. Spatial global average pooling is then applied to the N first-modality feature maps and the M second-modality feature maps to vectorize them.
The image fusion module has two parts: a cross-modality mapping part and a weight computation part. The first part receives the vectorized first- and second-modality feature maps, i.e., the input (n × d) of each modality, and normalizes each modality's features through its own fully connected layer to remove modality-specific information, so that the remaining information of the two modalities can be compared directly and clearly. The second part concatenates the two modality feature sets output by the first part into a feature of size 2n × d, obtains the weight of each feature map after two linear layers and activation function layers as a feature of size 2n × 1, and finally multiplies each feature map's weight with the corresponding first- and second-modality features to compute a feature of size 1 × d.
Optionally, the fusion module is an MM-MIL module, a new module for fusing at least two modality feature maps. It inherits the interpretability of instance-attention-based MIL and fuses instance-level CFP (color fundus photograph) and OCT (optical coherence tomography) features with instance attention weights into a single feature vector. The outputs of the two modalities' SW-GAPs (spatial global average pooling) are first sent into h MM-MIL modules; inside each module, the 2n instance-level features (2n × d) are aggregated into one feature (1 × d) carrying multi-modal information, and each such feature is converted through a linear layer into a category decision score (1 × m) as the module output. The h output score vectors are average-pooled (average pooling is simply the mean over the module outputs) and activated with a sigmoid (S-shaped function) to produce the final probability prediction score, i.e., the disease category probability value. If the mean score of a category is greater than 0.5, the model considers the eye to contain that disease. Because each of the several MM-MIL modules learns a different feature emphasis, combining the information of all modules lets the model learn more comprehensively and decide more accurately.
Further, the attention weight of each instance is obtained by averaging the attention weights computed in the h MM-MIL modules, i.e., the h attention scores of each instance. From these attention weights one can read off how much each instance contributes to the overall model, that is, how strongly each instance influences the model's decision.
Specifically, the computation can be divided into two steps. The first step multiplies the instance attention scores (2n × 1) element-wise with the modality-mapped features (2n × d) to obtain features of size 2n × d.
The second step sums the features from the first step along the first dimension (of length 2n) to obtain a feature of size 1 × d as the output of the module. Since the two steps together are equivalent to a single matrix multiplication, only one multiplication sign is drawn in the block diagram.
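The equivalence is easy to check numerically; in this sketch (shapes and sizes illustrative), the element-wise multiply-and-sum equals the single matrix product of the transposed score vector with the feature matrix:

```python
import torch

n, d = 12, 512
feats = torch.randn(2 * n, d)  # modality-mapped instance features (2n x d)
attn = torch.rand(2 * n, 1)    # instance attention scores (2n x 1)

two_step = (feats * attn).sum(dim=0, keepdim=True)  # multiply, then sum
one_matmul = attn.T @ feats                         # single (1 x 2n)(2n x d)

assert torch.allclose(two_step, one_matmul, atol=1e-5)  # both are (1, d)
```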
The image classification module compares preset modality features with the modality features of the disease currently to be confirmed, thereby confirming the disease category. Features with the same or similar weights tend to be compared with one another, which shortens the comparison time and improves the efficiency of confirming the disease category.
The disease category module obtains the probability value of each preset disease category, representing the likelihood that the disease of that category is present. The H fused modality feature maps to be identified are classified as follows: the fused maps are averaged, and the average is compared with a preset probability; if the average exceeds the preset probability (here 0.5), the eye image is determined to contain a disease, and the disease type is determined from the modality feature maps. In practice, the weight probabilities of the modality features are average-pooled, the averaged values are activated with a sigmoid (S-shaped function) to obtain the weight probabilities of the modality feature maps, and these average probabilities are compared with the preset probability.
It is to be understood that the above embodiments merely illustrate the principles of the present application and are not to be construed as limiting it. Any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the present application shall fall within its protection scope. The appended claims are intended to cover all such changes and modifications that fall within their scope and range of equivalents.

Claims (10)

1. An image-based disease identification method, the method comprising:
acquiring N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
performing H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused modality feature maps to be identified;
classifying each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each of the preset disease categories, the probability value representing the likelihood that the disease of that category is present.
2. The image-based disease identification method according to claim 1, wherein performing feature fusion on the N first-modality feature maps and the M second-modality feature maps, for any fused modality feature map to be identified, comprises:
converting the N first-modality feature maps into N third-modality feature maps and the M second-modality feature maps into M fourth-modality feature maps, wherein the N third-modality feature maps and the M fourth-modality feature maps are comparable in modality;
configuring a first initial weight for each of the N third-modality feature maps and a second initial weight for each of the M fourth-modality feature maps, wherein each first initial weight represents the importance of the corresponding third-modality feature map and each second initial weight represents the importance of the corresponding fourth-modality feature map;
and fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified.
3. The image-based disease identification method according to claim 2, wherein fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified comprises:
computing the product of each of the N third-modality feature maps with its corresponding first initial weight, and the product of each of the M fourth-modality feature maps with its corresponding second initial weight;
and adding all of the products to obtain the fused modality feature map to be identified.
4. The image-based disease identification method according to claim 1, wherein the method comprises:
obtaining a first importance coefficient and a second importance coefficient for each of the N first-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding first-modality feature map and the second importance coefficient represents the region importance of the corresponding first-modality feature map;
and generating a first disease prediction map from the first importance coefficient and the second importance coefficient of each first-modality feature map, wherein the first disease prediction map represents the disease expression in the first modality.
5. The image-based disease identification method according to claim 1, further comprising:
obtaining a first importance coefficient and a second importance coefficient for each of the M second-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding second-modality feature map and the second importance coefficient represents the region importance of the corresponding second-modality feature map;
and generating a second disease prediction map from the first importance coefficient and the second importance coefficient of each second-modality feature map, wherein the second disease prediction map represents the disease expression in the second modality.
6. The image-based disease identification method according to any one of claims 2 to 4, wherein obtaining, for any first-modality feature map, the first importance coefficient of that feature map comprises:
averaging the H first initial weights corresponding to the first-modality feature map to obtain its first importance coefficient, wherein the H first initial weights are produced during the H rounds of feature fusion of the N first-modality feature maps and the M second-modality feature maps.
7. The image-based disease identification method according to claim 1, wherein the N first-modality sub-images are obtained from the first-modality image by:
cropping the first-modality image to obtain N1 first-modality sub-images, and flipping each of the N1 first-modality sub-images to obtain N2 first-modality sub-images, wherein the sum of N1 and N2 is N.
8. The image-based disease identification method according to any one of claims 1 to 5, wherein acquiring the N first-modality feature maps and the M second-modality feature maps comprises:
performing feature extraction on the N first-modality sub-images through a first feature extraction model to obtain the N first-modality feature maps and the second importance coefficient of each first-modality feature map;
and performing feature extraction on the M second-modality sub-images through a second feature extraction model to obtain the M second-modality feature maps and the second importance coefficient of each second-modality feature map.
9. The image-based disease identification method according to claim 1, wherein classifying each of the H fused modality feature maps to be identified against the preset disease categories to obtain a probability value for each disease category comprises:
if a probability value is greater than 0.5, determining the disease category corresponding to that probability value as a target category.
10. An image-based disease identification apparatus, the apparatus comprising:
an image acquisition module, configured to acquire N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
an image fusion module, configured to perform H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused maps;
and an image classification module, configured to classify each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each disease category, the probability value representing the likelihood that the disease of that category is present.
CN202111109909.7A (priority date 2021-09-18, filing date 2021-09-18): Disease identification method and device based on image. Status: Pending. Publication: CN113793326A (en).

Priority Applications (1)

Application Number: CN202111109909.7A
Priority Date: 2021-09-18
Filing Date: 2021-09-18
Title: Disease identification method and device based on image

Publications (1)

Publication Number: CN113793326A
Publication Date: 2021-12-14

Family

ID: 78879087

Family Applications (1)

Application Number: CN202111109909.7A
Title: Disease identification method and device based on image
Priority Date: 2021-09-18
Filing Date: 2021-09-18

Country Status (1)

Country: CN, Publication: CN113793326A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party

CN109685819A (2019-04-26), 厦门大学: A three-dimensional medical image segmentation method based on feature enhancement
WO2021169723A1 (2021-09-02), Oppo广东移动通信有限公司: Image recognition method and apparatus, electronic device, and storage medium
CN113129267A (2021-07-16), 杭州电子科技大学: OCT image detection method and system based on retinal layering data
CN113158821A (2021-07-23), 中国科学院深圳先进技术研究院: Multimodal eye detection data processing method and device, and terminal equipment

Cited By (3)

* Cited by examiner, † Cited by third party

CN114841970A (2022-08-02), 北京字节跳动网络技术有限公司: Inspection image recognition method and apparatus, readable medium, and electronic device
CN117292443A (2023-12-26), 杭州名光微电子科技有限公司: Multi-modal recognition system and method fusing face and palm vein
CN117292443B (2024-06-07), 杭州名光微电子科技有限公司: Multi-modal recognition system and method fusing face and palm vein

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination