CN113793326A - Disease identification method and device based on image - Google Patents

Disease identification method and device based on image

Info

Publication number: CN113793326A
Application number: CN202111109909.7A
Authority: CN (China)
Prior art keywords: modal, feature maps, modality, disease, feature
Priority date / filing date: 2021-09-18
Publication date: 2021-12-14
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 赵建春, 周阳, 林海澜, 丁大勇
Current assignee: Beijing Vistel Technology Co., Ltd.
Original assignee: Beijing Vistel Technology Co., Ltd.
Application filed by Beijing Vistel Technology Co., Ltd.

Classifications

    • G06T 7/0012 — Image analysis; inspection of images, e.g. flaw detection; biomedical image inspection
    • G06F 18/2415 — Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
    • G06F 18/253 — Pattern recognition; fusion techniques applied to extracted features
    • G06N 3/045 — Neural network architectures; combinations of networks
    • G06T 2207/10101 — Image acquisition modality; optical tomography; optical coherence tomography [OCT]
    • G06T 2207/30041 — Subject of image; biomedical image processing; eye, retina, ophthalmic

Abstract

The embodiments of the present application disclose an image-based disease identification method and apparatus. In the method, N first-modality feature maps and M second-modality feature maps undergo H rounds of feature fusion to obtain H fused modality feature maps to be identified. Each of the H fused maps is then classified against preset disease categories, yielding a probability value for each category that represents the likelihood that the disease of that category is present. By fusing feature images from at least two modalities and detecting the disease category automatically from the fused images, the method avoids manual detection, improves the accuracy and efficiency of image-based disease category detection, and avoids manually selecting images and manually annotating the lesions and disease categories of images.

Description

Disease identification method and device based on image
Technical Field
The present application relates to the field of image processing technology, and in particular to an image-based method and apparatus for disease identification.
Background
At present, clinical classification of fundus disease types is based on fundus color photographs (CFP) and OCT images, and relies mainly on manual annotation: the captured images are labeled by hand, and the labels are then examined to determine which disease types the fundus color photographs and OCT images contain.
Existing multi-modal automatic detection of fundus diseases processes fundus color photographs together with OCT images, but takes only a single OCT image as input, and that image must be selected manually. Because OCT is usually acquired as a sequence, using a single OCT image loses much of the information in the OCT modality, and the information in the chosen OCT image is still annotated manually. As a result, the available data cannot be fully exploited and detection accuracy is low.
Disclosure of Invention
In view of the above technical problems, the embodiments of the present application provide an image-based disease identification method and apparatus, so as to solve the problem that the above approach cannot fully utilize the available data, resulting in low detection accuracy.
A first aspect of an embodiment of the present application provides an image-based disease identification method, including:
acquiring N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
performing H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused modality feature maps to be identified;
classifying each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each of the preset disease categories, the probability value representing the likelihood that the disease of that category is present.
In one embodiment, performing feature fusion on the N first-modality feature maps and the M second-modality feature maps, for any fused modality feature map to be identified, includes:
converting the N first-modality feature maps into N third-modality feature maps and converting the M second-modality feature maps into M fourth-modality feature maps, wherein the N third-modality feature maps and the M fourth-modality feature maps are comparable in modality;
configuring a first initial weight for each of the N third-modality feature maps and a second initial weight for each of the M fourth-modality feature maps, wherein each first initial weight represents the importance of the corresponding third-modality feature map and each second initial weight represents the importance of the corresponding fourth-modality feature map;
and fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified.
In an embodiment, fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified includes:
computing the product of each of the N third-modality feature maps with its corresponding first initial weight, and the product of each of the M fourth-modality feature maps with its corresponding second initial weight;
and adding all of the products to obtain the fused modality feature map to be identified.
In one embodiment, the method comprises:
obtaining a first importance coefficient and a second importance coefficient for each of the N first-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding first-modality feature map and the second importance coefficient represents the region importance of the corresponding first-modality feature map;
and generating a first disease prediction map from the first importance coefficient and the second importance coefficient of each first-modality feature map, wherein the first disease prediction map represents the disease expression in the first modality.
In one embodiment, the method further comprises:
obtaining a first importance coefficient and a second importance coefficient for each of the M second-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding second-modality feature map and the second importance coefficient represents the region importance of the corresponding second-modality feature map;
and generating a second disease prediction map from the first importance coefficient and the second importance coefficient of each second-modality feature map, wherein the second disease prediction map represents the disease expression in the second modality.
In one embodiment, for any first-modality feature map, obtaining the first importance coefficient of that feature map comprises:
averaging the H first initial weights corresponding to the first-modality feature map to obtain its first importance coefficient, the H first initial weights being produced during the H rounds of feature fusion of the N first-modality feature maps and the M second-modality feature maps.
In one embodiment, the N first-modality sub-images are obtained from the first-modality image as follows:
cropping the first-modality image to obtain N1 first-modality sub-images, and flipping each of the N1 sub-images to obtain N2 further first-modality sub-images, where the sum of N1 and N2 is N.
In one embodiment, acquiring the N first-modality feature maps and the M second-modality feature maps includes:
performing feature extraction on the N first-modality sub-images through a first feature extraction model to obtain the N first-modality feature maps and the second importance coefficient of each first-modality feature map;
and performing feature extraction on the M second-modality sub-images through a second feature extraction model to obtain the M second-modality feature maps and the second importance coefficient of each second-modality feature map.
In an embodiment, classifying each of the H fused modality feature maps to be identified against the preset disease categories to obtain a probability value for each disease category includes:
if a probability value is greater than 0.5, determining the disease category corresponding to that probability value as a target category.
A second aspect of an embodiment of the present application provides an image-based disease identification apparatus, including:
an image acquisition module, configured to acquire N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
an image fusion module, configured to perform H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused maps;
and an image classification module, configured to classify each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each disease category, the probability value representing the likelihood that the disease of that category is present.
In the embodiments of the present application, H rounds of feature fusion are performed on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified; each fused map contains both first-modality and second-modality features, and the same modality feature carries different weights in different fused maps. At least two modality feature images are thus fused, and the disease category is detected automatically from the fused images, avoiding manual detection. The H fused maps are then classified against preset disease categories to obtain a probability value for each category, representing the likelihood that the corresponding disease is present. This improves the accuracy and efficiency of image-based disease category detection, and avoids manually selecting images and manually annotating the lesions and disease categories of images.
Drawings
The features and advantages of the present application will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the present application in any way, and in which:
FIG. 1 is a schematic flow chart of an image-based disease identification method provided by the present application;
FIG. 2 is a schematic diagram of the fusion flow of two specific modality feature maps in the image-based disease identification method provided by the present application.
Detailed Description
In the following description, numerous specific details are set forth by way of example in order to provide a thorough understanding of the relevant disclosure. It will be apparent, however, to one skilled in the art that the present application may be practiced without these specific details. It should be understood that the terms "system," "apparatus," "unit," and/or "module" are used herein to distinguish components, elements, parts, or assemblies at different levels; these terms may be replaced by other expressions that achieve the same purpose.
It will be understood that when a device, unit, or module is referred to as being "on," "connected to," or "coupled to" another device, unit, or module, it can be directly on, connected, or coupled to, or in communication with, the other device, unit, or module, or intervening devices, units, or modules may be present, unless the context clearly indicates otherwise. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application. As used in the specification and claims of this application, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises" and "comprising" indicate the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, integers, steps, operations, elements, and/or components.
These and other features and characteristics of the present application, as well as the operation and function of the related structural elements and the combination of parts and economies of manufacture, will be better understood from the following description taken in conjunction with the accompanying drawings, which form a part of this specification. It is to be expressly understood, however, that the drawings are for illustration and description only and are not intended to define the limits of the application. The figures are not drawn to scale.
The application describes an image-based disease identification method, as shown in fig. 1, comprising:
step S10 obtains N first modality feature maps and M second modality feature maps, where N and M are both integers greater than or equal to 1, where the N first modality feature maps are feature maps of N pairs of first modality sub-images, respectively, and the N pairs of first modality sub-images are obtained according to the first modality image.
Step S20 performs H times of feature fusion on the N first modal feature maps and the M second modal feature maps, respectively, to obtain H fusion modal feature maps to be identified, where H is an integer greater than or equal to 2, each of the H fusion modal feature maps to be identified includes the first modal feature and the second modal feature, and weights of the same modal feature in the H fusion modal feature maps to be identified are different from each other.
Step S30 is to classify the H fusion modality feature maps to be identified according to preset disease categories, to obtain a probability value corresponding to each disease category in the preset disease categories, where the probability value represents the possibility of each category of disease in the preset category.
With this implementation, at least two modality feature images are fused and the disease category is detected automatically from the fused images. Manual detection is avoided, the accuracy of image-based disease category detection is improved and its efficiency rises accordingly, and manual selection of images and manual annotation of lesions and disease categories are also avoided.
The N first-modality feature maps and the M second-modality feature maps are obtained in step S10, where N and M are integers greater than or equal to 1. In one embodiment, the number of first-modality feature maps equals the number of second-modality feature maps; equal counts make the subsequent fusion of the N first-modality and M second-modality feature maps more uniform, so the fused features are more salient and disease identification more accurate. Of course, in other embodiments, the two counts may differ.
Step S20 performs H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain the H fused modality feature maps to be identified. Fusing the two different modality feature maps makes their information complementary and the fusion between them more thorough, which speeds up and aids the subsequent identification of the disease type.
Step S30 yields a probability value for each of the preset disease categories; the probability value represents the likelihood that the disease of that category is present, and the disease category is finally determined from these values. Manual detection is thereby avoided, the accuracy of image-based disease category detection is improved, detection efficiency rises, and manual selection of images and manual annotation of lesions and disease categories are avoided.
Referring to FIG. 1 and FIG. 2, in some embodiments, performing feature fusion on the N first-modality feature maps and the M second-modality feature maps, for any fused modality feature map to be identified, includes: converting the N first-modality feature maps into N third-modality feature maps and the M second-modality feature maps into M fourth-modality feature maps, where the N third-modality feature maps and the M fourth-modality feature maps are comparable in modality.
Specifically, the application performs image fusion and classification with an MM-MIL module, a new module for multi-modal feature fusion. The MM-MIL module inherits the interpretability of instance-attention-based MIL (multiple instance learning) and fuses instance-level CFP/OCT features with instance attention weights into a single feature vector; that is, the N first-modality feature maps and the M second-modality feature maps undergo feature fusion into vector features. The MM-MIL module has two parts. The first is a cross-modality mapping part: it receives the inputs of the two modalities, i.e., the first-modality feature maps (n × d) and the second-modality feature maps (n × d), and normalizes each through its own fully connected layer to strip away modality-specific information, making the two modalities comparable and enabling their fusion.
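The following is a minimal PyTorch sketch of this cross-modality mapping step. PyTorch itself, the class name, and the layer sizes d_in and d_out are illustrative assumptions rather than the patent's reference implementation; the only structure taken from the text is one fully connected layer per modality.

```python
import torch
import torch.nn as nn

class CrossModalMapping(nn.Module):
    """Maps CFP and OCT instance features into a comparable shared space
    using one fully connected layer per modality, as described above."""

    def __init__(self, d_in: int = 2048, d_out: int = 512):
        super().__init__()
        self.fc_cfp = nn.Linear(d_in, d_out)  # normalizes CFP instance features
        self.fc_oct = nn.Linear(d_in, d_out)  # normalizes OCT instance features

    def forward(self, cfp: torch.Tensor, oct_: torch.Tensor):
        # cfp: (n, d_in) CFP instance features; oct_: (m, d_in) OCT instance features
        return self.fc_cfp(cfp), self.fc_oct(oct_)
```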
First initial weights are then configured for the N third-modality feature maps, and second initial weights for the M fourth-modality feature maps, where each first initial weight represents the importance of the corresponding third-modality feature map and each second initial weight represents the importance of the corresponding fourth-modality feature map. The N third-modality feature maps and the M fourth-modality feature maps are fused according to these weights to obtain the fused modality feature map to be identified.
Specifically, the second part of the MM-MIL module is the instance attention computation part. It concatenates the two feature sets output by the first part into a matrix of size 2n × d, passes this through two linear layers and activation function layers to obtain a weight for each instance (size 2n × 1), and multiplies each instance's weight by the original instance feature to obtain a module output of size 1 × d. When several MM-MIL modules are used, each learns a different feature emphasis, i.e., each fused modality feature map to be identified captures different features, so combining the information from all modules lets the model learn more comprehensively, decide more accurately, and determine the disease category more reliably.
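A sketch of the instance attention part under the same assumptions: the hidden width, the Tanh nonlinearity between the two linear layers, and the softmax normalization of the scores are all illustrative choices; the text only specifies "two linear layers and activation function layers" mapping (2n × d) to (2n × 1).

```python
import torch
import torch.nn as nn

class InstanceAttention(nn.Module):
    """Scores each of the 2n concatenated CFP+OCT instances with two linear
    layers and activations, producing one weight per instance."""

    def __init__(self, d: int = 512, hidden: int = 128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(d, hidden),
            nn.Tanh(),            # activation between the two linear layers
            nn.Linear(hidden, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (2n, d) concatenated instances -> (2n, 1) attention weights
        return torch.softmax(self.score(feats), dim=0)
```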
In an embodiment, fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first and second initial weights to obtain the fused modality feature map to be identified includes: computing the product of each of the N third-modality feature maps with its first initial weight and of each of the M fourth-modality feature maps with its second initial weight, and adding all of the products to obtain the fused modality feature map to be identified.
Specifically, the MM-MIL module computes these weighted products and obtains the fused modality feature map to be identified in two steps. First, the modality-mapped features (2n × d) are multiplied element-wise by the per-instance weights (2n × 1) produced by the attention score computation (the two linear layers and activation function layers), giving features of size 2n × d. Second, the features from the first step are summed along the first dimension (of length 2n) to obtain a 1 × d feature as the module output.
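The two-step fusion as a sketch (the function name is assumed; the shapes follow the text):

```python
import torch

def fuse_instances(feats: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """feats: (2n, d) modality-mapped instance features;
    weights: (2n, 1) instance attention weights."""
    weighted = feats * weights                 # step 1: element-wise, (2n, d)
    return weighted.sum(dim=0, keepdim=True)   # step 2: sum over 2n -> (1, d)
```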
In one embodiment, the method comprises: obtaining a first importance coefficient and a second importance coefficient for each of the N first-modality feature maps, where the first importance coefficient represents the importance of the corresponding first-modality feature map and the second importance coefficient represents its region importance. Predicting from these two importance measures, the importance of the feature map as a whole and the importance of regions within it, makes the predicted disease category more accurate.
A first disease prediction map is generated from the first and second importance coefficients of each first-modality feature map; the first disease prediction map represents the disease expression in the first modality.
It should be noted that because the first disease prediction map represents the disease expression in the first modality, in urgent cases it can be inspected directly to predict whether the eye is diseased, preparing for subsequent treatment. In effect, visualizing the first- or second-modality feature maps provides a preliminary screening judgment before the eye disease type is finally confirmed, saving substantial time in follow-up.
In one embodiment, a first importance coefficient and a second importance coefficient are likewise obtained for each of the M second-modality feature maps, the first representing the importance of the corresponding second-modality feature map and the second representing its region importance. Again, predicting from these two importance measures makes the predicted disease category more accurate.
A second disease prediction map is generated from the first and second importance coefficients of each second-modality feature map; the second disease prediction map represents the disease expression in the second modality, and in urgent cases it can likewise be inspected to predict whether the eye is diseased, making full preparation for subsequent treatment.
In one embodiment, for any first-modality feature map, obtaining its first importance coefficient comprises:
averaging the H first initial weights corresponding to that feature map, the H weights being produced during the H rounds of feature fusion of the N first-modality feature maps and the M second-modality feature maps. Mean pooling is an averaging operation: the fused results output by the MM-MIL modules are averaged, and whether the feature map indicates disease is decided by comparing the average with a preset score.
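Where the first importance coefficient is this average over the H passes, a small sketch might look as follows (the (H, N) weight layout is an assumed convention):

```python
import torch

def first_importance_coefficients(weights: torch.Tensor) -> torch.Tensor:
    """weights: (H, N) - the first initial weight given to each of the N
    first-modality feature maps in each of the H fusion passes.
    Returns the (N,) first importance coefficients (mean over H)."""
    return weights.mean(dim=0)
```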
In one embodiment, the N first-modality sub-images are obtained from the first-modality image by cropping the image to obtain N1 first-modality sub-images and flipping each of them to obtain N2 further sub-images, where the sum of N1 and N2 is N.
Optionally, the N first-modality sub-images are obtained by pseudo-oversampling the first-modality image: the four corners and the center of the image are cropped to yield the N1 first-modality sub-images.
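A sketch of this pseudo-oversampling using torchvision's five-crop (four corners plus center) followed by a flip of each crop. The crop size and the choice of horizontal flipping are assumptions; this particular configuration yields N1 = N2 = 5, whereas the embodiment below uses 12 sub-images per modality.

```python
import torchvision.transforms.functional as TF

def pseudo_oversample(img, crop_size=(448, 448)):
    """Crop the four corners and center, then flip each crop,
    giving N = N1 + N2 sub-images from one modality image."""
    crops = list(TF.five_crop(img, crop_size))  # N1 = 5 sub-images
    flips = [TF.hflip(c) for c in crops]        # N2 = 5 flipped copies
    return crops + flips                        # N = 10
```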
In some embodiments, acquiring the N first-modality feature maps and the M second-modality feature maps includes:
performing feature extraction on the N first-modality sub-images through a first feature extraction model to obtain the N first-modality feature maps and the second importance coefficient of each first-modality feature map;
and performing feature extraction on the M second-modality sub-images through a second feature extraction model to obtain the M second-modality feature maps and the second importance coefficient of each second-modality feature map.
In another embodiment, features are extracted from the N first-modality sub-images and the M second-modality sub-images; the extracted feature maps are arranged into sequences, and the sequences of first- and second-modality feature maps are fed in parallel into two different 2D CNNs (two-dimensional convolutional neural networks). Each extracted feature map has size w × h × d, so the two sequences have sizes N × w × h × d and M × w × h × d (d = 2048, h = w = 8), where N = M = 12 is the number of sub-images per modality. This scheme uses ResNet-50 (a residual neural network); alternatively, other feature extraction models may be used.
Further, spatial global average pooling is applied to the N first-modality feature maps and the M second-modality feature maps so that each feature map yields a feature vector of the required dimension. The first- and second-modality feature maps are then fused by the MM-MIL module to obtain the fused modality feature maps to be identified, so that the information in the N first-modality feature maps and the M second-modality feature maps is fused and made mutually complementary, further increasing the accuracy of disease category judgment.
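A sketch of the per-modality backbone and spatial global average pooling. The 256 × 256 input resolution is an assumption chosen to be consistent with the h = w = 8, d = 2048 sizes quoted above (ResNet-50 downsamples by a factor of 32), and the function names are illustrative.

```python
import torch
import torch.nn as nn
import torchvision.models as models

def make_backbone() -> nn.Module:
    """One ResNet-50 per modality, truncated before its pooling/FC head
    so it emits (n, 2048, 8, 8) feature maps for 256 x 256 inputs."""
    resnet = models.resnet50(weights=None)
    return nn.Sequential(*list(resnet.children())[:-2])

def extract_instance_features(backbone: nn.Module, sub_images: torch.Tensor):
    # sub_images: (n, 3, 256, 256) -> feature maps (n, 2048, 8, 8)
    fmap = backbone(sub_images)
    # spatial global average pooling: one d-dimensional vector per sub-image
    return fmap.mean(dim=(2, 3))  # (n, 2048)

backbone_cfp, backbone_oct = make_backbone(), make_backbone()  # two modalities
```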
In an embodiment, classifying each of the H fused modality feature maps to be identified against the preset disease categories to obtain a probability value for each disease category includes: if a probability value is greater than 0.5, determining the disease category corresponding to that probability value as the target category.
Preferably, before the probability of each modality feature is obtained, each modality feature is converted into a corresponding weight through a linear layer. The outputs are then average-pooled and passed through a sigmoid (S-shaped) function to compute the average probability of each disease category; if a probability exceeds 0.5, the eye is confirmed to be diseased and the disease category corresponding to that modality feature is determined as the target category. In practice, the modality features of the preset disease categories are compared with the modality features whose probabilities have been computed; since features with the same or similar weights tend to be compared with one another, the comparison time is shortened and the efficiency of disease category identification improves.
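A sketch of this classification head under stated assumptions: a single linear layer shared across the H module outputs, mean pooling over H, then a sigmoid with the 0.5 threshold; the number of classes is illustrative.

```python
import torch
import torch.nn as nn

class DiseaseHead(nn.Module):
    """Maps the H fused (1 x d) features to per-category probabilities:
    linear layer -> average pooling over the H scores -> sigmoid."""

    def __init__(self, d: int = 512, num_classes: int = 8):
        super().__init__()
        self.classifier = nn.Linear(d, num_classes)

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        # fused: (H, d), one fused feature per MM-MIL module
        scores = self.classifier(fused)       # (H, num_classes)
        return torch.sigmoid(scores.mean(0))  # (num_classes,) probabilities

# A category is reported as present when its probability exceeds 0.5:
#   detected = (head(fused) > 0.5).nonzero().flatten()
```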
The present application also provides an apparatus for image-based disease identification, the apparatus comprising:
an image acquisition module, configured to acquire N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
an image fusion module, configured to perform H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused maps;
and an image classification module, configured to classify each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each disease category, the probability value representing the likelihood that the disease of that category is present.
Through the image acquisition module, the image fusion module, and the image classification module, images from the two modalities are processed and finally fused, so that disease categories can be detected automatically from at least two modalities. Manual detection is avoided, the accuracy of image-based disease category detection is improved, detection efficiency rises, and manual selection of images and manual annotation of lesions and lesion types are also avoided.
It should be noted that the image acquisition module acquires the N first-modality feature maps and the M second-modality feature maps and feeds them, in sequence order and in parallel, into two different 2D-CNN networks. The extracted feature maps have a fixed size, and the size of each modality's sequence is determined by the number of feature maps in it. Spatial global average pooling is then applied to the N first-modality feature maps and the M second-modality feature maps to vectorize them.
The image fusion module has two parts: a cross-modality mapping part and a weight computation part. The first part receives the vectorized first- and second-modality feature maps, i.e., the input (n × d) of each modality, and normalizes each modality's features through its own fully connected layer to remove modality-specific information, so that the remaining information of the two modalities can be compared directly and clearly. The second part concatenates the two modality feature sets output by the first part into a feature of size 2n × d, obtains the weight of each feature map after two linear layers and activation function layers as a feature of size 2n × 1, and finally multiplies each feature map's weight with the corresponding first- and second-modality features to compute a feature of size 1 × d.
Optionally, the fusion module is an MM-MIL module, a new module for fusing at least two modality feature maps. It inherits the interpretability of instance-attention-based MIL and fuses instance-level CFP (color fundus photograph) and OCT (optical coherence tomography) features with instance attention weights into a single feature vector. The outputs of the two modalities' SW-GAPs (spatial global average pooling) are first sent into h MM-MIL modules; inside each module, the 2n instance-level features (2n × d) are aggregated into one feature (1 × d) carrying multi-modal information, and each such feature is converted through a linear layer into a category decision score (1 × m) as the module output. The h output score vectors are average-pooled (average pooling is simply the mean over the module outputs) and activated with a sigmoid (S-shaped function) to produce the final probability prediction score, i.e., the disease category probability value. If the mean score of a category is greater than 0.5, the model considers the eye to contain that disease. Because each of the several MM-MIL modules learns a different feature emphasis, combining the information of all modules lets the model learn more comprehensively and decide more accurately.
Further, the attention weight of each instance is obtained by averaging the attention weights computed in the h MM-MIL modules, i.e., the h attention scores of each instance. From these attention weights one can read off how much each instance contributes to the overall model, that is, how strongly each instance influences the model's decision.
Specifically, the computation can be divided into two steps. The first step multiplies the instance attention scores (2n × 1) element-wise with the modality-mapped features (2n × d) to obtain features of size 2n × d.
The second step sums the features from the first step along the first dimension (of length 2n) to obtain a feature of size 1 × d as the output of the module. Since the two steps together are equivalent to a single matrix multiplication, only one multiplication sign is drawn in the block diagram.
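The equivalence is easy to check numerically; in this sketch (shapes and sizes illustrative), the element-wise multiply-and-sum equals the single matrix product of the transposed score vector with the feature matrix:

```python
import torch

n, d = 12, 512
feats = torch.randn(2 * n, d)  # modality-mapped instance features (2n x d)
attn = torch.rand(2 * n, 1)    # instance attention scores (2n x 1)

two_step = (feats * attn).sum(dim=0, keepdim=True)  # multiply, then sum
one_matmul = attn.T @ feats                         # single (1 x 2n)(2n x d)

assert torch.allclose(two_step, one_matmul, atol=1e-5)  # both are (1, d)
```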
The image classification module compares preset modality features with the modality features of the disease currently to be confirmed, thereby confirming the disease category. Features with the same or similar weights tend to be compared with one another, which shortens the comparison time and improves the efficiency of confirming the disease category.
The disease category module obtains the probability value of each preset disease category, representing the likelihood that the disease of that category is present. The H fused modality feature maps to be identified are classified as follows: the fused maps are averaged, and the average is compared with a preset probability; if the average exceeds the preset probability (here 0.5), the eye image is determined to contain a disease, and the disease type is determined from the modality feature maps. In practice, the weight probabilities of the modality features are average-pooled, the averaged values are activated with a sigmoid (S-shaped function) to obtain the weight probabilities of the modality feature maps, and these average probabilities are compared with the preset probability.
It is to be understood that the above embodiments merely illustrate the principles of the present application and are not to be construed as limiting it. Any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the present application shall fall within its protection scope. The appended claims are intended to cover all such changes and modifications that fall within their scope and range of equivalents.

Claims (10)

1. An image-based disease identification method, the method comprising:
acquiring N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
performing H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused modality feature maps to be identified;
classifying each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each of the preset disease categories, the probability value representing the likelihood that the disease of that category is present.
2. The image-based disease identification method according to claim 1, wherein performing feature fusion on the N first-modality feature maps and the M second-modality feature maps, for any fused modality feature map to be identified, comprises:
converting the N first-modality feature maps into N third-modality feature maps and the M second-modality feature maps into M fourth-modality feature maps, wherein the N third-modality feature maps and the M fourth-modality feature maps are comparable in modality;
configuring a first initial weight for each of the N third-modality feature maps and a second initial weight for each of the M fourth-modality feature maps, wherein each first initial weight represents the importance of the corresponding third-modality feature map and each second initial weight represents the importance of the corresponding fourth-modality feature map;
and fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified.
3. The image-based disease identification method according to claim 2, wherein fusing the N third-modality feature maps and the M fourth-modality feature maps according to the first initial weights and the second initial weights to obtain the fused modality feature map to be identified comprises:
computing the product of each of the N third-modality feature maps with its corresponding first initial weight, and the product of each of the M fourth-modality feature maps with its corresponding second initial weight;
and adding all of the products to obtain the fused modality feature map to be identified.
4. The image-based disease identification method according to claim 1, wherein the method comprises:
obtaining a first importance coefficient and a second importance coefficient for each of the N first-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding first-modality feature map and the second importance coefficient represents the region importance of the corresponding first-modality feature map;
and generating a first disease prediction map from the first importance coefficient and the second importance coefficient of each first-modality feature map, wherein the first disease prediction map represents the disease expression in the first modality.
5. The image-based disease identification method according to claim 1, further comprising:
obtaining a first importance coefficient and a second importance coefficient for each of the M second-modality feature maps, wherein the first importance coefficient represents the importance of the corresponding second-modality feature map and the second importance coefficient represents the region importance of the corresponding second-modality feature map;
and generating a second disease prediction map from the first importance coefficient and the second importance coefficient of each second-modality feature map, wherein the second disease prediction map represents the disease expression in the second modality.
6. The image-based disease identification method according to any one of claims 2 to 4, wherein obtaining, for any first-modality feature map, the first importance coefficient of that feature map comprises:
averaging the H first initial weights corresponding to the first-modality feature map to obtain its first importance coefficient, wherein the H first initial weights are produced during the H rounds of feature fusion of the N first-modality feature maps and the M second-modality feature maps.
7. The image-based disease identification method according to claim 1, wherein the N first-modality sub-images are obtained from the first-modality image by:
cropping the first-modality image to obtain N1 first-modality sub-images, and flipping each of the N1 first-modality sub-images to obtain N2 first-modality sub-images, wherein the sum of N1 and N2 is N.
8. The image-based disease identification method according to any one of claims 1 to 5, wherein acquiring the N first-modality feature maps and the M second-modality feature maps comprises:
performing feature extraction on the N first-modality sub-images through a first feature extraction model to obtain the N first-modality feature maps and the second importance coefficient of each first-modality feature map;
and performing feature extraction on the M second-modality sub-images through a second feature extraction model to obtain the M second-modality feature maps and the second importance coefficient of each second-modality feature map.
9. The image-based disease identification method according to claim 1, wherein classifying each of the H fused modality feature maps to be identified against the preset disease categories to obtain a probability value for each disease category comprises:
if a probability value is greater than 0.5, determining the disease category corresponding to that probability value as a target category.
10. An image-based disease identification apparatus, the apparatus comprising:
an image acquisition module, configured to acquire N first-modality feature maps and M second-modality feature maps, wherein N and M are both integers greater than or equal to 1, the N first-modality feature maps are the feature maps of N first-modality sub-images, and the N first-modality sub-images are obtained from a first-modality image;
an image fusion module, configured to perform H rounds of feature fusion on the N first-modality feature maps and the M second-modality feature maps to obtain H fused modality feature maps to be identified, wherein H is an integer greater than or equal to 2, each of the H fused modality feature maps to be identified contains both the first-modality features and the second-modality features, and the weights of the same modality feature differ across the H fused maps;
and an image classification module, configured to classify each of the H fused modality feature maps to be identified against preset disease categories to obtain a probability value for each disease category, the probability value representing the likelihood that the disease of that category is present.
CN202111109909.7A (priority date 2021-09-18, filing date 2021-09-18): Disease identification method and device based on image. Status: Pending. Publication: CN113793326A (en).

Priority Applications (1)

Application Number: CN202111109909.7A
Priority Date: 2021-09-18
Filing Date: 2021-09-18
Title: Disease identification method and device based on image

Publications (1)

Publication Number: CN113793326A
Publication Date: 2021-12-14

Family

ID: 78879087

Family Applications (1)

Application Number: CN202111109909.7A
Title: Disease identification method and device based on image
Priority Date: 2021-09-18
Filing Date: 2021-09-18

Country Status (1)

Country: CN, Publication: CN113793326A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party

CN109685819A (2019-04-26), 厦门大学: A three-dimensional medical image segmentation method based on feature enhancement
WO2021169723A1 (2021-09-02), Oppo广东移动通信有限公司: Image recognition method and apparatus, electronic device, and storage medium
CN113129267A (2021-07-16), 杭州电子科技大学: OCT image detection method and system based on retinal layering data
CN113158821A (2021-07-23), 中国科学院深圳先进技术研究院: Multimodal eye detection data processing method and device, and terminal equipment

Cited By (3)

* Cited by examiner, † Cited by third party

CN114841970A (2022-08-02), 北京字节跳动网络技术有限公司: Inspection image recognition method and apparatus, readable medium, and electronic device
CN117292443A (2023-12-26), 杭州名光微电子科技有限公司: Multi-modal recognition system and method fusing face and palm vein
CN117292443B (2024-06-07), 杭州名光微电子科技有限公司: Multi-modal recognition system and method fusing face and palm vein

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination