CN113469049B - Disease information identification method, system, device and storage medium - Google Patents

Disease information identification method, system, device and storage medium

Info

Publication number
CN113469049B
CN113469049B (application CN202110744807.6A)
Authority
CN
China
Prior art keywords
image
disease
identification
features
recognition
Prior art date
Legal status
Active
Application number
CN202110744807.6A
Other languages
Chinese (zh)
Other versions
CN113469049A (en)
Inventor
伍世宾
周宸
陈远旭
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110744807.6A
Publication of CN113469049A
Application granted
Publication of CN113469049B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images


Abstract

The invention discloses a disease information identification method, system, device and storage medium. The method comprises the following steps: acquiring an image to be identified, wherein the image to be identified contains disease information; extracting features from the image to be identified to obtain image features; performing first disease identification according to the image features to obtain a plurality of first identification results; acquiring question data corresponding to the first identification results from a database; acquiring response data corresponding to the question data; performing natural language processing on the response data to extract inquiry features; inputting the inquiry features and the image features into a Transformer module for feature fusion to obtain fusion features; and performing second disease identification according to the fusion features to obtain a target identification result. By combining image features with inquiry features, the invention enriches the expressive power of the features and thereby improves identification accuracy; it can be widely applied in the field of artificial intelligence.

Description

Disease information identification method, system, device and storage medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a disease information identification method, system, device and storage medium.
Background
In recent years, the integration of artificial intelligence with the medical and health field has deepened continuously. As technologies such as voice interaction, computer vision and cognitive computing gradually mature, the application scenarios of artificial intelligence have become richer, and artificial intelligence has gradually become an important factor in the development of the medical industry and the improvement of medical services.
In medical and health applications, the identification of disease information is an important component, laying the foundation for subsequent medical decisions by doctors. By exploiting the learning ability of artificial intelligence, the efficiency of disease information identification can be improved, greatly improving doctors' working efficiency.
Existing disease information identification techniques rely only on the detection and classification of single lesion images and do not combine them with other information, which limits identification accuracy.
Disclosure of Invention
In order to solve, at least to some extent, one of the technical problems in the prior art, the invention aims to provide a disease information identification method, system, device and storage medium that fuse image features with inquiry features to enrich the expressive power of the features and thereby improve identification accuracy.
To achieve the above object, an embodiment of the present invention provides a disease information identification method, including the steps of:
acquiring an image to be identified, wherein the image to be identified contains disease information;
extracting features from the image to be identified to obtain image features;
performing first disease identification according to the image features to obtain a plurality of first identification results;
acquiring question data corresponding to the first identification results from a database;
acquiring response data corresponding to the question data;
performing natural language processing on the response data to extract inquiry features;
inputting the inquiry features and the image features into a Transformer module for feature fusion to obtain fusion features;
and performing second disease identification according to the fusion features to obtain a target identification result.
In order to achieve the above object, an embodiment of the present invention further provides a disease information identification system, including:
an image data acquisition module, used to acquire an image to be identified, wherein the image to be identified contains disease information;
a first feature extraction module, used to extract features from the image to be identified to obtain image features;
a first identification module, used to perform first disease identification according to the image features to obtain a plurality of first identification results;
a question data acquisition module, used to acquire question data corresponding to the first identification results from a database;
a response data acquisition module, used to acquire response data corresponding to the question data;
a second feature extraction module, used to perform natural language processing on the response data and extract inquiry features;
a feature fusion module, used to input the inquiry features and the image features into a Transformer module for feature fusion to obtain fusion features;
and a second identification module, used to perform second disease identification according to the fusion features to obtain a target identification result.
In order to achieve the above object, an embodiment of the present invention further provides a disease information identifying apparatus, including:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to carry out the steps of the aforementioned method.
To achieve the above object, an embodiment of the present invention further proposes a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the foregoing method.
According to the disease information identification method, system, device and storage medium, image features are fused with inquiry features extracted by natural language processing to obtain fusion features, enriching the expressive power of the features; identification is then performed on the fusion features, which greatly improves the accuracy of disease information identification.
Drawings
Fig. 1 is a flowchart of steps of a disease information identification method according to an embodiment of the present invention;
FIG. 2 is a flowchart of the steps for capturing an image in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a first disease identification based on image features in an embodiment of the invention;
FIG. 4 is a schematic diagram of second disease identification based on fusion features in an embodiment of the invention;
FIG. 5 is a schematic diagram of acquiring disease information in combination with a first disease identification and a second disease identification in an embodiment of the invention;
FIG. 6 is a block diagram of a disease information recognition system according to an embodiment of the present invention;
Fig. 7 is a block diagram of a disease information identifying apparatus according to an embodiment of the present invention.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In the following description, suffixes such as "module", "part" or "unit" used to denote elements are intended only to facilitate the description of the present invention and have no special meaning in themselves. Therefore, "module", "component" and "unit" may be used interchangeably.
As shown in fig. 1, the present embodiment provides a disease information identification method, including the following steps:
step S110: and acquiring an image to be identified, wherein the image to be identified contains disease information.
Specifically, the image to be identified can be uploaded after being acquired by a patient through an image acquisition device, wherein the image acquisition device can be a mobile phone terminal, a tablet personal computer, a camera and other devices. The image refers to an image obtained by shooting a disease area of a patient, wherein the image contains disease information, and the disease information refers to symptoms of the disease on the body, such as skin diseases, and red spots, acne, ulceration inflammation and the like are generated on the skin; also such as a black nail, a block of black areas is presented on the fingernail. The patient shoots the image acquisition equipment aiming at the disease part, and the shot image is used as important identification input information, so that the shot image needs to be clear and effective, otherwise, the subsequent identification result is influenced. For example, the image is not focused accurately in the shooting process, and a fuzzy place appears, so that the subsequent recognition effect is directly influenced, for example, the shot picture is not comprehensive enough, only a local area of a disease is shot, all areas are not shot, and the recognition effect is also influenced. Therefore, in the process of capturing an image, it is necessary to strictly control the quality of the image and prompt an unsatisfactory image to require re-shooting, and referring to fig. 2, step S110 may further include:
Step S111: acquiring an image containing a lesion area according to a preset guidance prompt, wherein the guidance prompt is used to indicate the position of the lesion area;
Step S112: detecting whether the image containing the lesion area meets the acquisition requirements; if so, taking the image containing the lesion area as the image to be identified, and if not, prompting the user to re-acquire an image containing the lesion area.
In step S111, the preset guidance prompt may be a text prompt, a pop-up prompt, a frame marking the target acquisition area, or the like, which is not limited in this embodiment of the application.
For example, take the preset guidance prompt to be a frame marking the target acquisition area (an area frame for short), with the image acquired by a mobile phone. When the patient opens the camera to photograph the lesion area, a corresponding area frame is displayed on the phone screen, and the patient is prompted to align the area frame with the lesion area as closely as possible before shooting. In addition, the illumination of the global and local areas can be monitored in real time to avoid pictures that are too dark from poor or insufficient lighting, and the focal length can be checked in real time against a preset interval, with the patient prompted to shoot only once it meets the requirements. After the patient takes the image, the phone automatically uploads it to the background system, i.e., the disease image identification system. On receiving the image, the system first checks whether its brightness, blur value and so on meet the requirements; if so, the image is identified, and if not, feedback is sent to the phone prompting the patient to take the photograph again. These constraints ensure consistency in the color, illumination, brightness, blur value and so on of the acquired dermatological pictures, which improves the subsequent recognition effect. Note that the detailed steps listed in this embodiment are optional and do not imply that the disease information identification method of the present invention must execute all of them.
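As a concrete illustration of such a check, the following is a minimal sketch of a brightness and blur test using OpenCV; the threshold values are assumptions for illustration, since the embodiment does not specify concrete limits.

```python
import cv2

# Illustrative thresholds -- the embodiment does not give concrete values.
MIN_BRIGHTNESS, MAX_BRIGHTNESS = 60.0, 200.0   # mean gray-level bounds
MIN_SHARPNESS = 100.0                          # variance-of-Laplacian blur limit

def image_meets_requirements(path: str) -> bool:
    """Return True if the captured lesion image passes the brightness and blur checks."""
    img = cv2.imread(path)
    if img is None:                    # unreadable upload fails the check
        return False
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    brightness = float(gray.mean())    # overall illumination of the picture
    sharpness = float(cv2.Laplacian(gray, cv2.CV_64F).var())  # low variance = blurry
    return MIN_BRIGHTNESS <= brightness <= MAX_BRIGHTNESS and sharpness >= MIN_SHARPNESS
```

If the function returns False, the background system would feed that result back to the phone and prompt the patient to re-shoot.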
Step S120: extracting features from the image to be identified to obtain image features.
Specifically, a deep network with a multi-scale feature pyramid may be employed for feature extraction. There are two main ways of handling multiple scales in visual tasks: image pyramids and feature pyramids. A feature pyramid in effect combines multi-scale feature fusion with multi-scale prediction: semantic information from the higher layers is propagated step by step to the lower layers through upsampling and lateral connections. Concretely, the higher-layer features are upsampled by a factor of 2, the lower-layer features pass through a 1x1 convolution that changes their channel count, and the two results are added. The feature map at each layer thus fuses features of different resolutions, so that objects are detected at the matching resolution; every layer keeps both an appropriate resolution and strong semantic features, which makes small targets detectable while retaining strong semantic information for classification. The output feature map at each resolution is sent on independently for target detection, and feature maps of different sizes are selected for regions of interest (ROIs) of different sizes (i.e., multi-size processing), which improves subsequent recognition accuracy.
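The top-down step just described can be sketched in PyTorch as follows; this is an illustrative reading of the paragraph, not the patent's exact network, and the channel counts are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNLateral(nn.Module):
    """One top-down step of a feature pyramid: upsample the higher-layer map by 2,
    project the lower-layer map to the same channel count with a 1x1 conv, then add."""
    def __init__(self, low_channels: int, out_channels: int = 256):
        super().__init__()
        self.lateral = nn.Conv2d(low_channels, out_channels, kernel_size=1)

    def forward(self, top: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        top_up = F.interpolate(top, scale_factor=2, mode="nearest")  # 2x upsampling
        return top_up + self.lateral(low)  # element-wise sum of the two paths

# e.g. fusing a 256-channel 7x7 top map with a 512-channel 14x14 lower map:
step = FPNLateral(low_channels=512)
fused = step(torch.randn(1, 256, 7, 7), torch.randn(1, 512, 14, 14))  # 1 x 256 x 14 x 14
```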
Step S130: performing first disease identification according to the image features to obtain a plurality of first identification results.
In step S130, a lightweight and efficient single-modal disease classification model may be applied to the image. As shown in fig. 3, a deep network with a multi-scale feature pyramid serves as the backbone, followed by an attention mechanism module and a spatial pyramid pooling layer so that both global and local features receive attention. In fig. 3, the Attention Module is the attention mechanism module, SPP is the spatial pyramid pooling layer, FC is a fully connected layer, and Softmax is a softmax classifier. The attention mechanism module comprises a channel attention unit (Channel Attention Module) and a spatial attention unit (Spatial Attention Module).
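Read literally, fig. 3 suggests a classification head of the following shape. This is a hedged sketch: the backbone and attention module are passed in as black boxes, and the pyramid pooling grid sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleModalClassifier(nn.Module):
    """Backbone -> attention module -> spatial pyramid pooling -> FC -> Softmax."""
    def __init__(self, backbone: nn.Module, attention: nn.Module,
                 channels: int, num_classes: int, pool_sizes=(1, 2, 4)):
        super().__init__()
        self.backbone, self.attention = backbone, attention
        self.pool_sizes = pool_sizes
        self.fc = nn.Linear(channels * sum(s * s for s in pool_sizes), num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.attention(self.backbone(x))   # attend to global and local features
        # SPP: pool the map onto several fixed grids, flatten, and concatenate,
        # giving a fixed-length vector regardless of the input resolution.
        pooled = [F.adaptive_max_pool2d(f, s).flatten(1) for s in self.pool_sizes]
        return F.softmax(self.fc(torch.cat(pooled, dim=1)), dim=1)
```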
The channel attention unit (Channel Attention Module) compresses the feature map along the spatial dimensions into a one-dimensional vector before operating on it. When compressing along the spatial dimensions, not only mean pooling (Average Pooling) but also max pooling (Max Pooling) is considered. Average pooling and max pooling aggregate the spatial information of the feature map; the results are sent to a shared network that compresses the spatial dimension of the input feature map, and the outputs are summed element by element to produce the channel attention map. For a single image, channel attention focuses on what in the image is important. Average pooling receives feedback from every pixel of the feature map, whereas during gradient back-propagation max pooling receives gradient feedback only at the locations of strongest response in the feature map.
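A minimal sketch of this unit, in the style of CBAM's channel attention, is shown below; the reduction ratio is an assumption.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Average- and max-pool the map over space, pass both vectors through a
    shared MLP, sum element-wise, and use a sigmoid gate over the channels."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))               # average-pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))                # max-pooling branch
        gate = torch.sigmoid(avg + mx).view(b, c, 1, 1)  # channel attention map
        return x * gate
```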
The spatial attention unit (Spatial Attention Module) compresses the channels: average pooling (AvgPool) and max pooling (MaxPool) are applied along the channel dimension. MaxPool extracts the maximum value over the channels at each spatial position, so it is applied height times width times; AvgPool likewise extracts the mean over the channels at each position. The two resulting feature maps (each with one channel) are then concatenated to obtain a 2-channel feature map.
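The corresponding sketch for the spatial unit, again in CBAM style, follows; the 7x7 convolution kernel is an assumption.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Average- and max-pool across the channel axis (each yields a 1 x H x W map),
    concatenate into a 2-channel map, convolve, and gate the input spatially."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)   # AvgPool over channels: B x 1 x H x W
        mx = x.amax(dim=1, keepdim=True)    # MaxPool over channels: B x 1 x H x W
        gate = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * gate
```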
Step S140: acquiring question data corresponding to the first identification results from the database.
Through the identification in step S130, a number of first identification results are obtained, for example one or more results. When a single result is obtained, the identification is not in dispute and the subsequent processing is relatively simple, so it is not discussed in detail here. When two or more identification results are obtained, question data corresponding to the first identification results is acquired from a database according to those results. The database is established in advance: it may be a knowledge base in which senior doctors have constructed question templates for common disease types, or it may come from the Internet or from a question bank provided by a third party.
In some embodiments, when the number of identification results is relatively large, for example more than 3, the few results with the highest probabilities (for example the top three) may be kept as the final first identification results according to the probability of each result, and the question data is then acquired for those results, as shown in the sketch below. The acquired question data can be returned for display to a terminal device held by the patient, such as a mobile phone, tablet computer or notebook computer.
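A minimal top-k selection over the class probabilities could look like this; the disease labels are purely illustrative.

```python
import torch

def top_k_results(probs: torch.Tensor, labels: list[str], k: int = 3) -> list[tuple[str, float]]:
    """Keep only the k most probable first identification results (top-3 here)."""
    values, indices = torch.topk(probs, k)
    return [(labels[int(i)], float(v)) for v, i in zip(values, indices)]

# Illustrative labels and probabilities:
print(top_k_results(torch.tensor([0.40, 0.30, 0.20, 0.10]),
                    ["eczema", "psoriasis", "tinea", "dermatitis"]))
# [('eczema', 0.40...), ('psoriasis', 0.30...), ('tinea', 0.20...)]
```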
The question data may be returned to the patient's mobile phone in text and/or voice form. For example, if the identified results are eczema, psoriasis and a form of tinea, the clinical characteristics and phenotypes of the three diseases are obtained from the disease association knowledge base and the corresponding question data is fed back.
Step S150: acquiring response data corresponding to the question data.
Specifically, after the question data is fed back to the patient's mobile phone, it can be displayed in text and/or voice form. The patient answers according to the content of the question data, by text input or voice input; response data is generated from the input and sent to the background system.
In an alternative embodiment, the question data may include multiple-choice questions. Taking text question data as an example, question 1 may offer options A, B, C and D, or simply the options "yes" and "no". After image recognition the probable cause has already been narrowed to a certain range (such as three identification results), so the corresponding question data, i.e., data that distinguishes those three identification results, is acquired for them. For example, if the question targeting one disease is answered "yes" while the questions targeting the other diseases are answered "no", information about the corresponding disease can be determined further.
If no effective data can be obtained through the multiple-choice questions, that is, if it is difficult to narrow down the identification results further from the patient's answers (for example, the patient answers "no" to every question, in which case the three candidate results may not include the disease the patient actually has), the patient is asked to describe the disease in response to the question data, forming inquiry feature information; the description may be textual or spoken. The patient describes the shape, size, distribution, location, duration and other aspects of the disease, the corresponding response data is generated, and the response data is fed back to the background system.
Step S160: performing natural language processing on the response data to extract inquiry features.
In step S160, natural language processing (NLP) may be used to extract inquiry features from the response data; the inquiry features include descriptions of the shape, location, size, color and so on of the disease. For example, the visual manifestations of eczema and acne are similar and the visual difference between their lesions is small, but essential feature differences between the two diseases do exist and can be obtained from the inquiry data; enriching the expressive power of the features by fusing in the inquiry features can greatly improve recognition accuracy.
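One plausible way to obtain such text depth features is a pretrained language-model encoder. The sketch below uses the Hugging Face transformers library; the checkpoint name is an assumption, since the embodiment does not name a specific text encoder.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint; any encoder for the patients' language would do.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

def inquiry_features(answer_text: str) -> torch.Tensor:
    """Encode the patient's free-text answer into per-token inquiry features."""
    tokens = tokenizer(answer_text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = encoder(**tokens)
    return out.last_hidden_state  # shape: 1 x T x hidden_size
```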
Step S170: inputting the inquiry features and the image features into the Transformer module for feature fusion to obtain fusion features.
Specifically, the inquiry features extracted from the response data by the aforementioned natural language processing technique, together with the image features extracted by the feature pyramid deep network, can be input to the Transformer module for feature fusion, which improves the expressive power of the features.
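A minimal fusion module consistent with this description projects both modalities into a common width and runs a standard Transformer encoder over their concatenation; the dimensions and layer counts are assumptions.

```python
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    """Concatenate projected image tokens and inquiry (text) tokens and run a
    Transformer encoder so the two modalities attend to each other."""
    def __init__(self, img_dim: int, txt_dim: int, d_model: int = 256,
                 nhead: int = 8, num_layers: int = 2):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, d_model)
        self.txt_proj = nn.Linear(txt_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, img_tokens: torch.Tensor, txt_tokens: torch.Tensor) -> torch.Tensor:
        # img_tokens: B x N x img_dim (flattened feature-map positions)
        # txt_tokens: B x T x txt_dim (token features from the NLP step)
        fused = torch.cat([self.img_proj(img_tokens), self.txt_proj(txt_tokens)], dim=1)
        return self.encoder(fused)  # fusion features: B x (N + T) x d_model
```

A classification head over these fusion features (e.g., pooling followed by a linear layer) would then produce the second identification results described next.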
Step S180: performing second disease identification according to the fusion features to obtain a target identification result.
Specifically, the second disease identification in step S180 may identify and classify the fusion features using a multi-modal disease classification model; see fig. 4, which comprises a Transformer encoder and a category classifier (Categories classifier). The Transformer is used to fuse the visual depth features (i.e., the image features) with the text depth features (i.e., the inquiry features) to further refine the classification result. The multi-modal disease classification model can fuse and train on data of different modalities based on meta-learning. The obtained target identification result can be sent for display to a terminal device held by the patient, such as a smartphone or notebook computer, so that the patient learns the disease information in time; it can also be sent to a terminal device held by the doctor, such as a smartphone or notebook computer, or to a desktop computer, background server or other machine available to the doctor, so that the doctor fully understands the condition and formulates a corresponding treatment plan.
In an alternative embodiment, the second identification result obtained by the second disease identification may be used directly as the target identification result, because in the second disease identification the features of two different modalities, vision and inquiry, are fully fused, greatly increasing the expressive power of the features and thereby improving recognition capability and accuracy.
For example, the manifestations of eczema and acne are similar and the visual difference between their lesions is small, yet intrinsic feature differences between them exist and can be obtained from the inquiry data, so the fusion features obtained by fusing the two modalities have stronger representational capability. Disease information can be identified with high accuracy from the fusion features.
In an alternative embodiment, the result obtained by the second disease identification may be combined with the response data to obtain the final target identification result. After the second disease identification, a plurality of (two or more) second identification results are obtained, together with the probability corresponding to each. Part of the response data consists of the patient's answers to the question data, and based on those answers the probabilities of some identification results can be increased and those of others decreased. Combining the two sets of probabilities, the disease identification result with the maximum probability is obtained and taken as the final identification result.
After the first disease identification, n (for example 3) first identification results and n corresponding first probability values are obtained, and question data is acquired from the database according to the n first identification results; the question data includes multiple items with options. For example, there are n multiple-choice questions, corresponding to the n identification results respectively. The patient answers the n questions, generating selection data. The background system then readjusts the probabilities of the identification results according to the selections: for a given identification result, if the patient's choice is "yes" (i.e., the patient's condition matches the condition asked about) while for the other identification results the choice is "no", the probability of the result answered "yes" is increased and the probabilities of the results answered "no" are decreased, yielding n new first probability values. A sketch of this readjustment follows.
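The readjustment might be implemented as a simple scale-and-renormalise step; the scaling factor is an assumption, as the embodiment does not state how much to raise or lower each probability.

```python
def adjust_first_probs(probs: dict[str, float], answers: dict[str, str],
                       factor: float = 1.5) -> dict[str, float]:
    """Scale up results whose question was answered "yes", scale down those
    answered "no", then renormalise so the values sum to 1 again."""
    scaled = {d: p * (factor if answers.get(d) == "yes" else 1.0 / factor)
              for d, p in probs.items()}
    total = sum(scaled.values())
    return {d: p / total for d, p in scaled.items()}
```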
After the second disease identification, m (for example 3) second identification results and m corresponding second probability values are obtained. The n results obtained by the first disease identification and the m results obtained by the second disease identification may be the same or different; for example, both identifications may yield: eczema, psoriasis, a form of tinea. The probability value of each identification result is then recalculated from the n new first probability values and the m second probability values. For example, if the n new first probability values are A 40%, B 30%, C 20% and the m second probability values are A 50%, B 20%, C 20%, the two can be averaged to give the final probability values A 45%, B 25%, C 20%, and the identification result with the highest final probability value is taken as the final target identification result. Alternatively, different weights may be assigned to the n new first probability values and the m second probability values when calculating the final probability values.
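The weighted combination can be written directly; equal weights of 0.5 reproduce the plain average in the example above.

```python
def combine_probs(first: dict[str, float], second: dict[str, float],
                  w1: float = 0.5, w2: float = 0.5) -> dict[str, float]:
    """Weighted combination of the adjusted first probabilities and the second
    probabilities; the result with the highest value is the target result."""
    diseases = set(first) | set(second)
    return {d: w1 * first.get(d, 0.0) + w2 * second.get(d, 0.0) for d in diseases}

combined = combine_probs({"A": 0.40, "B": 0.30, "C": 0.20},
                         {"A": 0.50, "B": 0.20, "C": 0.20})
# A: 0.45, B: 0.25, C: 0.20 (up to float rounding); A has the maximum value.
```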
In an alternative embodiment, the final target identification result may be obtained by combining the result obtained by the first disease identification with the result obtained by the second disease identification.
After the first disease identification, a plurality of (two or more) first identification results and their corresponding probabilities are obtained; after the second disease identification, a plurality of (two or more) second identification results and their corresponding probabilities are obtained. The probabilities from the two stages, first disease identification and second disease identification, can therefore be combined, and the identification result corresponding to the maximum probability is taken as the final target identification result. A weight can be assigned to each stage, the final probability of each identification result calculated, and the result corresponding to the maximum probability taken as the final identification result.
Referring to fig. 5, suppose for example that a patient has psoriasis (a skin disease). After photographing and first disease identification with the single-modal model, multiple results are obtained, from which the three with the highest probabilities (the top-3 results) are kept: eczema, psoriasis, a form of tinea. According to the top-3 results, the clinical characteristics and phenotypes of the three diseases are obtained from the disease association knowledge base and targeted questions are posed for each disease; if the patient answers them affirmatively, effective inquiry information is obtained. If the patient answers everything with "no", it is possible that none of the top-3 results of the first disease identification hit the disease in the picture; the patient is then asked to describe the disease (spoken and/or written) in the manner of a conventional inquiry. Combining the effective inquiry information with the conventional-inquiry description, natural language processing is used to extract the inquiry features, and the multi-modal model performs the second disease identification to obtain the second identification results. The first and second identification results are combined, and the identification result with the highest probability is taken as the final identification result.
The above method is described below in an application scenario. A patient with a skin disease photographs the lesion area with a mobile phone. During shooting, guidance prompts (such as text prompts, pop-up prompts or a frame marking the target acquisition area) assist the patient in capturing a clear and valid image. The system checks the image taken by the patient: if it meets the requirements, image recognition proceeds; otherwise the patient is prompted to re-shoot until a satisfactory image is obtained.
After the background system obtains a satisfactory image, it performs feature extraction and first disease identification on it, obtaining 3 first identification results and their corresponding probability values. Question data is acquired from the database according to the 3 first identification results and fed back to the mobile phone.
The mobile phone displays the question data, and the patient replies to the questions in it. Some questions have options, which the patient answers by selecting one; other questions require the patient to reply with a description (textual and/or spoken). The mobile phone turns the patient's replies into response data and feeds it back to the background system.
From the response data, the background system extracts the inquiry features using natural language processing, inputs the inquiry features and the image features into the Transformer module for fusion to obtain the fusion features, and performs the second disease identification according to the fusion features, obtaining 3 identification results and their corresponding probability values. The identification results of the first and second disease identifications are combined, the identification result corresponding to the maximum probability value is taken as the final identification result, and the final identification result is fed back to the patient's mobile phone.
As shown in fig. 6, an embodiment of the present invention provides a disease information identification system, including:
The image data acquisition module is used to acquire an image to be identified, wherein the image to be identified contains disease information.
Specifically, the image data acquisition module is used to capture and upload an image with disease information, and may be a mobile phone, tablet computer, camera or similar device. The image is obtained by photographing the diseased area of the patient and contains disease information, that is, the visible symptoms of the disease on the body: for example, a skin disease may produce red spots, acne or ulcerative inflammation on the skin, and a black nail presents as a black area on the fingernail. The patient aims the image acquisition device at the diseased area, and the captured image serves as important input for identification, so it must be clear and valid; otherwise the subsequent identification result is affected. For example, an image that is out of focus and blurred in places directly degrades the subsequent recognition effect, and a picture that captures only a local part of the diseased area rather than all of it also harms recognition. Therefore, during image capture the image quality must be strictly controlled, and an unsatisfactory image should trigger a prompt to re-shoot. Referring to fig. 2, step S110 may further include:
Step S111: acquiring an image containing a lesion area according to a preset guidance prompt, wherein the guidance prompt is used to indicate the position of the lesion area;
Step S112: detecting whether the image containing the lesion area meets the acquisition requirements; if so, taking the image containing the lesion area as the image to be identified, and if not, prompting the user to re-acquire an image containing the lesion area.
Taking the image acquired by a mobile phone as an example: when the patient opens the camera to photograph the lesion area, a corresponding area frame is displayed on the phone screen, and the patient is prompted to align the area frame with the lesion area as closely as possible before shooting. In addition, the illumination of the global and local areas can be monitored in real time to avoid pictures that are too dark from poor or insufficient lighting, and the focal length can be checked in real time against a preset interval, with the patient prompted to shoot only once it meets the requirements. After the patient takes the image, the phone automatically uploads it to the background system, i.e., the disease image identification system. On receiving the image, the system first checks whether its brightness, blur value and so on meet the requirements; if so, the image is identified, and if not, feedback is sent to the phone prompting the patient to take the photograph again. These constraints ensure consistency in the color, illumination, brightness, blur value and so on of the acquired dermatological pictures, which improves the subsequent recognition effect. Note that the detailed steps listed in this embodiment are optional and do not imply that the disease information identification method of the present invention must execute all of them.
The first feature extraction module is used to extract features from the image to be identified to obtain image features.
Specifically, a deep network with a multi-scale feature pyramid may be employed for feature extraction. There are two main ways of handling multiple scales in visual tasks: image pyramids and feature pyramids. A feature pyramid in effect combines multi-scale feature fusion with multi-scale prediction: semantic information from the higher layers is propagated step by step to the lower layers through upsampling and lateral connections. Concretely, the higher-layer features are upsampled by a factor of 2, the lower-layer features pass through a 1x1 convolution that changes their channel count, and the two results are added. The feature map at each layer thus fuses features of different resolutions, so that objects are detected at the matching resolution; every layer keeps both an appropriate resolution and strong semantic features, which makes small targets detectable while retaining strong semantic information for classification. The output feature map at each resolution is sent on independently for target detection, and feature maps of different sizes are selected for regions of interest (ROIs) of different sizes (i.e., multi-size processing), which improves subsequent recognition accuracy.
The first identification module is used to perform first disease identification according to the image features to obtain a plurality of first identification results.
In the first identification module, a lightweight and efficient single-modal disease classification model may be applied to the image. As shown in fig. 3, a deep network with a multi-scale feature pyramid serves as the backbone, followed by an attention mechanism module and a spatial pyramid pooling layer so that both global and local features receive attention. In fig. 3, the Attention Module is the attention mechanism module, SPP is the spatial pyramid pooling layer, FC is a fully connected layer, and Softmax is a softmax classifier. The attention mechanism module comprises a channel attention unit (Channel Attention Module) and a spatial attention unit (Spatial Attention Module).
The channel attention unit compresses the feature map along the spatial dimensions into a one-dimensional vector before operating on it. When compressing along the spatial dimensions, not only mean pooling (Average Pooling) but also max pooling (Max Pooling) is considered. Average pooling and max pooling aggregate the spatial information of the feature map; the results are sent to a shared network that compresses the spatial dimension of the input feature map, and the outputs are summed element by element to produce the channel attention map. For a single image, channel attention focuses on what in the image is important. Average pooling receives feedback from every pixel of the feature map, whereas during gradient back-propagation max pooling receives gradient feedback only at the locations of strongest response in the feature map.
The spatial attention unit compresses the channels: average pooling (AvgPool) and max pooling (MaxPool) are applied along the channel dimension. MaxPool extracts the maximum value over the channels at each spatial position, so it is applied height times width times; AvgPool likewise extracts the mean over the channels at each position. The two resulting feature maps (each with one channel) are then concatenated to obtain a 2-channel feature map.
The question data acquisition module is used to acquire question data corresponding to the first identification results from the database.
Through the identification by the first identification module, a number of first identification results are obtained, for example one or more results. When a single result is obtained, the identification is not in dispute and the subsequent processing is relatively simple, so it is not discussed in detail here. When two or more identification results are obtained, question data corresponding to the first identification results is acquired from a database according to those results. The database is established in advance: it may be a knowledge base in which senior doctors have constructed question templates for common disease types, or it may come from the Internet or from a question bank provided by a third party.
In some embodiments, when the number of identification results is relatively large, for example more than 3, the few results with the highest probabilities (for example the top three) may be kept as the final first identification results according to the probability of each result, and the question data is then acquired for those results. The acquired question data can be returned for display to a terminal device held by the patient, such as a mobile phone, tablet computer or notebook computer. The question data may be returned to the patient's mobile phone in text and/or voice form. For example, if the identified results are eczema, psoriasis and a form of tinea, the clinical characteristics and phenotypes of the three diseases are obtained from the disease association knowledge base and the corresponding question data is fed back.
The response data acquisition module is used to acquire response data corresponding to the question data.
Specifically, after the question data is fed back to the patient's mobile phone, it can be displayed in text and/or voice form. The patient answers according to the content of the question data, by text input or voice input; response data is generated from the input and sent to the background system.
In an alternative embodiment, the question data may include multiple-choice questions. Taking text question data as an example, question 1 may offer options A, B, C and D, or simply the options "yes" and "no". After image recognition the probable cause has already been narrowed to a certain range (such as three identification results), so the corresponding question data, i.e., data that distinguishes those three identification results, is acquired for them. For example, if the question targeting one disease is answered "yes" while the questions targeting the other diseases are answered "no", information about the corresponding disease can be determined further.
If no effective data can be obtained through the multiple-choice questions, that is, if it is difficult to narrow down the identification results further from the patient's answers (for example, the patient answers "no" to every question, in which case the three candidate results may not include the disease the patient actually has), the patient is asked to describe the disease in response to the question data, forming inquiry feature information; the description may be textual or spoken. The patient describes the shape, size, distribution, location, duration and other aspects of the disease, the corresponding response data is generated, and the response data is fed back to the background system.
The second feature extraction module is used to perform natural language processing on the response data and extract inquiry features.
In the second feature extraction module, natural language processing (NLP) may be used to extract inquiry features from the response data; the inquiry features include descriptions of the shape, location, size, color and so on of the disease. For example, the visual manifestations of eczema and acne are similar and the visual difference between their lesions is small, but essential feature differences between the two diseases do exist and can be obtained from the inquiry data; enriching the expressive power of the features by fusing in the inquiry features can greatly improve recognition accuracy.
The feature fusion module is used to input the inquiry features and the image features into the Transformer module for feature fusion to obtain fusion features.
Specifically, the inquiry features extracted from the response data by natural language processing, together with the image features extracted by the feature pyramid deep network, are input to the Transformer module for feature fusion, which improves the expressive power of the features.
And the second recognition module is used for carrying out second disease recognition according to the fusion characteristics to obtain a target recognition result.
Specifically, the second disease identification in step S180 may use a multi-modal disease classification model for identifying and classifying the fusion features, see fig. 4, where the multi-modal disease classification model includes transformers encoders and a class classifier. Transformers are used to fuse visual depth features (i.e., image features) with text depth features (i.e., interview features) to further refine the classification result. The multi-modal disease classification model can perform data element fusion and training of different modalities based on element learning. The obtained target identification result can be sent to terminal equipment such as a smart phone, a notebook computer and the like held by a patient for display, so that the patient can acquire disease information in time; the medical treatment device can also be sent to terminal equipment such as a smart phone and a notebook computer held by a doctor or sent to a desktop computer, a background server and the like which can be referred by the doctor for display, so that the doctor can know the illness state fully and formulate a corresponding treatment scheme.
In an alternative embodiment, the second identification result obtained by the second disease identification may be used directly as the target identification result, because in the second disease identification the features of two different modalities, vision and inquiry, are fully fused, greatly increasing the expressive power of the features and thereby improving recognition capability and accuracy.
For example, the manifestations of eczema and acne are similar and the visual difference between their lesions is small, yet intrinsic feature differences between them exist and can be obtained from the inquiry data, so the fusion features obtained by fusing the two modalities have stronger representational capability. Disease information can be identified with high accuracy from the fusion features.
In an alternative embodiment, the result obtained by the second disease identification is combined with the response data to obtain the final target identification result. After the second disease identification, a plurality of (two or more) second identification results are obtained, together with the probability corresponding to each. Part of the response data consists of the patient's answers to the question data, and based on those answers the probabilities of some identification results can be increased and those of others decreased. Combining the two sets of probabilities, the disease identification result with the maximum probability is obtained and taken as the final identification result.
After the first disease identification, n (for example 3) first identification results and n corresponding first probability values are obtained, and question data is acquired from the database according to the n first identification results; the question data includes multiple items with options. For example, there are n multiple-choice questions, corresponding to the n identification results respectively. The patient answers the n questions, generating selection data. The background system then readjusts the probabilities of the identification results according to the selections: for a given identification result, if the patient's choice is "yes" (i.e., the patient's condition matches the condition asked about) while for the other identification results the choice is "no", the probability of the result answered "yes" is increased and the probabilities of the results answered "no" are decreased, yielding n new first probability values.
After the second disease identification, m (for example 3) second identification results and m corresponding second probability values are obtained. The n results obtained by the first disease identification and the m results obtained by the second disease identification may be the same or different; for example, both identifications may yield: eczema, psoriasis, a form of tinea. The probability value of each identification result is then recalculated from the n new first probability values and the m second probability values. For example, if the n new first probability values are A 40%, B 30%, C 20% and the m second probability values are A 50%, B 20%, C 20%, the two can be averaged to give the final probability values A 45%, B 25%, C 20%, and the identification result with the highest final probability value is taken as the final target identification result. Alternatively, different weights may be assigned to the n new first probability values and the m second probability values when calculating the final probability values.
In an alternative embodiment, the final target identification result may be obtained by combining the result obtained by the first disease identification with the result obtained by the second disease identification.
After the first disease identification, a plurality of (two or more) first identification results and their corresponding probabilities are obtained; after the second disease identification, a plurality of (two or more) second identification results and their corresponding probabilities are obtained. The probabilities from the two stages, first disease identification and second disease identification, can therefore be combined, and the identification result corresponding to the maximum probability is taken as the final target identification result. A weight can be assigned to each stage, the final probability of each identification result calculated, and the result corresponding to the maximum probability taken as the final identification result.
Referring to fig. 5, suppose for example that a patient has psoriasis (a skin disease). After photographing and first disease identification with the single-modal model, multiple results are obtained, from which the three with the highest probabilities (the top-3 results) are kept: eczema, psoriasis, a form of tinea. According to the top-3 results, the clinical characteristics and phenotypes of the three diseases are obtained from the disease association knowledge base and targeted questions are posed for each disease; if the patient answers them affirmatively, effective inquiry information is obtained. If the patient answers everything with "no", it is possible that none of the top-3 results of the first disease identification hit the disease in the picture; the patient is then asked to describe the disease (spoken and/or written) in the manner of a conventional inquiry. Combining the effective inquiry information with the conventional-inquiry description, natural language processing is used to extract the inquiry features, and the multi-modal model performs the second disease identification to obtain the second identification results. The first and second identification results are combined, and the identification result with the highest probability is taken as the final identification result.
As shown in fig. 7, an embodiment of the present invention proposes a disease information identification apparatus 30. The apparatus 30 includes a memory 31, a processor 32, a program stored on the memory and executable on the processor, and a data bus 33 for connection and communication between the processor 32 and the memory 31. When executed by the processor, the program implements the following specific steps, as shown in fig. 1:
Step S110: acquiring an image to be identified, wherein the image to be identified contains disease information.
Specifically, the image to be identified can be captured by the patient with an image acquisition device, such as a mobile phone, tablet computer or camera, and then uploaded. The image is obtained by photographing the diseased area of the patient and contains disease information, that is, the visible symptoms of the disease on the body: for example, a skin disease may produce red spots, acne or ulcerative inflammation on the skin, and a black nail presents as a black area on the fingernail. The patient aims the image acquisition device at the diseased area, and the captured image serves as important input for identification, so it must be clear and valid; otherwise the subsequent identification result is affected. For example, an image that is out of focus and blurred in places directly degrades the subsequent recognition effect, and a picture that captures only a local part of the diseased area rather than all of it also harms recognition. Therefore, during image capture the image quality must be strictly controlled, and an unsatisfactory image should trigger a prompt to re-shoot. Referring to fig. 2, step S110 may further include:
Step S111: acquiring an image containing a focus area according to a preset guidance prompt, wherein the guidance prompt is used for prompting the position of the focus area;
Step S112: detecting whether the image containing the focus area meets the acquisition requirement, if so, taking the image containing the focus area as an image to be identified, and if not, prompting to acquire the image containing the focus area again.
In step S111, the preset guiding prompt may be a text prompt, a popup prompt, a frame added to mark the target acquisition area, etc., which is not limited in the embodiment of the present application.
For example, take the preset guiding prompt to be a frame marking the target acquisition area (abbreviated as a region frame), and take an image acquired by a mobile phone as an example. When the patient opens the camera to photograph the focus area, a corresponding region frame can be displayed on the mobile phone screen, prompting the patient to aim the region frame at the focus area as closely as possible when shooting. In addition, the illumination of the global area and the local area can be monitored in real time to avoid pictures that are too dark due to poor lighting, and whether the focal length falls within a set interval can be detected in real time, prompting the patient to capture the image once the focal length meets the requirement. After the patient shoots the image, the mobile phone can automatically upload it to the background system, i.e., the system that performs disease identification on the image. After receiving the image to be identified, the system first checks whether the brightness, blur value, and so on of the image meet the requirements; if so, the image is identified, and if not, the result is fed back to the mobile phone terminal to prompt the patient to take the picture again. These constraints ensure consistency of color, illumination, brightness, blur value, and the like across the obtained dermatological pictures, thereby improving the subsequent recognition effect. It should be noted here that the detailed steps listed in this embodiment are optional, and do not imply that the disease information identifying method of the present invention must execute all of them.
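As a concrete illustration of this quality gate, the following minimal sketch checks brightness and sharpness before an image is accepted. The thresholds, and the use of mean gray level and the variance of the Laplacian as proxies for illumination and blur, are illustrative assumptions, not values specified by this embodiment:

```python
import cv2
import numpy as np

# Hypothetical thresholds; the embodiment does not fix concrete values.
MIN_BRIGHTNESS, MAX_BRIGHTNESS = 60, 200  # acceptable mean gray level
MIN_SHARPNESS = 100.0                     # minimum variance of the Laplacian

def image_meets_requirements(path):
    """Return True if the captured lesion image is bright and sharp enough."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:  # unreadable upload -> prompt the patient to re-shoot
        return False
    brightness = float(np.mean(img))                         # illumination proxy
    sharpness = float(cv2.Laplacian(img, cv2.CV_64F).var())  # blur proxy
    return MIN_BRIGHTNESS <= brightness <= MAX_BRIGHTNESS and sharpness >= MIN_SHARPNESS
```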
Step S120: and extracting features according to the image to be identified to obtain image features.
In particular, a depth network with a multi-scale feature pyramid may be employed for feature extraction. There are two main ways to handle multiple scales in visual tasks: image pyramids and feature pyramids. A feature pyramid effectively combines multi-scale feature fusion with multi-scale prediction: the semantic information of the higher layers is propagated step by step down to the lower layers through upsampling and lateral connections. Concretely, the higher-layer features are upsampled by a factor of 2, the lower-layer features have their channel count adjusted by a 1×1 convolution, and the two results are added. Each layer's feature map is thus equivalent to a fusion of features at different resolutions, so objects are detected at the matching resolution; every layer keeps both an appropriate resolution and strong semantic features, which allows small targets to be detected while preserving strong semantic information for classification. The feature map output at each resolution is sent independently into the subsequent target detection process, and feature maps of different sizes are selected for regions of interest (ROIs) of different sizes (i.e., multi-size processing), which improves the subsequent recognition accuracy.
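The top-down fusion step just described can be sketched as follows; this is a minimal illustration of a single pyramid level, assuming the higher-level map already has the output channel count, and is not the exact network of this embodiment:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNFusion(nn.Module):
    """One top-down step of a feature pyramid: upsample the higher-level map
    by 2x, adjust the lower-level map's channels with a 1x1 convolution,
    then add the two element-wise."""
    def __init__(self, low_channels, out_channels=256):
        super().__init__()
        self.lateral = nn.Conv2d(low_channels, out_channels, kernel_size=1)

    def forward(self, high, low):
        # high: (B, out_channels, H, W); low: (B, low_channels, 2H, 2W)
        up = F.interpolate(high, scale_factor=2, mode="nearest")  # 2x upsampling
        return up + self.lateral(low)  # high-level semantics + low-level detail
```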
Step S130: and carrying out first disease identification according to the image characteristics to obtain a plurality of first identification results.
In step S130, a lightweight, efficient single-mode disease classification model may be used for the images. As shown in fig. 3, a depth network with a multi-scale feature pyramid serves as the backbone, followed by an attention mechanism module and a spatial pyramid pooling layer so as to attend to both global and local features. In fig. 3, the Attention Module is the attention mechanism module, SPP is the spatial pyramid pooling layer, FC is a fully connected layer, and Softmax is a softmax classifier. The attention mechanism module includes a channel attention unit (Channel Attention Module) and a spatial attention unit (Spatial Attention Module).
The channel attention unit (Channel Attention Module) compresses the feature map in the spatial dimensions into a one-dimensional vector before operating on it. When compressing in the spatial dimensions, not only mean pooling (Average Pooling) but also maximum pooling (Max Pooling) is considered. Average pooling and maximum pooling aggregate the spatial information of the feature map, which is sent to a shared network that compresses the spatial dimensions of the input feature map; the outputs are then summed element by element to produce the channel attention map. For a single feature map, channel attention focuses on what in the map is important. During gradient back-propagation, average pooling provides feedback for every pixel of the feature map, whereas maximum pooling provides gradient feedback only at the location of the largest response.
The spatial attention unit (Spatial Attention Module) compresses the channels: average pooling (AvgPool) and maximum pooling (MaxPool) are each performed along the channel dimension. MaxPool extracts the maximum value over the channels, once for each of the height × width positions; AvgPool extracts the mean value over the channels, likewise once per position. The two resulting feature maps (each with a single channel) are then concatenated to obtain a 2-channel feature map.
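The two attention units read as a CBAM-style design; a minimal sketch under that assumption (the channel counts and reduction ratio are illustrative) is:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Compress H x W with both average and max pooling, pass each result
    through a shared MLP, sum element-wise, and gate the channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))  # AvgPool branch
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))   # MaxPool branch
        return x * torch.sigmoid(avg + mx)

class SpatialAttention(nn.Module):
    """Compress the channel dimension with average and max pooling (one
    H x W map each), stack them into a 2-channel map, and gate each location."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)   # AvgPool over channels
        mx, _ = torch.max(x, dim=1, keepdim=True)  # MaxPool over channels
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
```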
Step S140: and acquiring questioning data corresponding to the first recognition result from the database according to the first recognition result.
Through the recognition in step S130, one or more first recognition results are obtained. When only one recognition result is obtained, the result is uncontested and the subsequent processing is comparatively simple, so it is not discussed in detail here. When two or more recognition results are obtained, question data corresponding to the first recognition results is obtained from a database. The database is established in advance: it may be a knowledge base of question templates constructed by senior doctors for common disease types, or it may come from the Internet or from a question database provided by a third party.
In some embodiments, when the number of recognition results is relatively large, for example more than 3, the few results with the highest probabilities (for example, the first three) may be retained according to the probability of each recognition result, and the question data obtained for those retained results. The acquired question data can be returned for display to a terminal device held by the patient, such as a mobile phone, tablet computer, or notebook computer. The question data may be returned to the patient's mobile phone terminal in text and/or voice form; for example, if the recognized results are eczema, psoriasis, and a certain tinea, the clinical characteristics and phenotypes of the three diseases are obtained from the disease-association knowledge base and the corresponding question data is fed back.
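Keeping only the highest-probability candidates is a simple selection; a minimal sketch (the disease names and probabilities are made up for illustration):

```python
def top_k_results(results, k=3):
    """Keep only the k recognition results with the highest probabilities."""
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Example with hypothetical values:
# top_k_results({"eczema": 0.40, "psoriasis": 0.30, "tinea": 0.20, "acne": 0.10})
# -> [("eczema", 0.40), ("psoriasis", 0.30), ("tinea", 0.20)]
```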
Step S150: response data corresponding to the question data is acquired.
Specifically, after the question data is fed back to the patient's mobile phone terminal, it can be displayed in text and/or voice form. The patient answers according to the content of the question data, by text input or voice input; response data is generated from the input and sent to the background system.
In an alternative embodiment, the question data may include questions with options. Taking text question data as an example, question 1 may include options A/B/C/D, or the options 'yes' and 'no'. Because after image recognition the probable causes are already narrowed to a certain range (say, three recognition results), the corresponding question data, i.e., data for distinguishing among the three recognition results, is obtained for those results. For example, if the question for one disease is answered 'yes' while the question for another disease is answered 'no', information about the corresponding disease can be further confirmed.
If effective data cannot be obtained through option-based questioning, that is, if the recognition results are difficult to distinguish further from the patient's answers (for example, the patient answers 'no' to every question, suggesting that the three candidate results may not include the disease the patient actually has), the patient is asked to describe the disease in response to the question data, in text or voice form, so as to form inquiry feature information. The patient describes the shape, size, distribution, location, duration, and other aspects of the disease; the corresponding response data is generated and fed back to the background system.
Step S160: and carrying out natural language processing on the response data, and extracting to obtain inquiry features.
In step S160, Natural Language Processing (NLP) may be used to extract inquiry features from the response data, where the inquiry features include descriptions of the shape, location, size, color, and so on of the disease. For example, the visual manifestations of eczema and acne are similar and the visual difference between the lesions is small, yet essential characteristic differences exist between the two diseases and can be obtained from inquiry data; fusing the rich expressive power of the inquiry features can therefore greatly improve recognition accuracy.
Step S170: and inputting the inquiry features and the image features into a transformers module for feature fusion to obtain fusion features.
Specifically, the inquiry features extracted from the response data by natural language processing and the image features extracted by the feature-pyramid depth network are input together to the transformers module for feature fusion, which improves the expressive power of the features.
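One plausible reading of this fusion step, sketched with standard transformer encoder layers (the dimensions, pooling choice, and classifier head are assumptions, not the patent's exact architecture):

```python
import torch
import torch.nn as nn

class MultiModalFusion(nn.Module):
    """Concatenate inquiry-text tokens and image-feature tokens, let a
    transformer encoder attend across both modalities, then classify."""
    def __init__(self, dim=256, heads=8, layers=2, num_classes=100):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, text_tokens, image_tokens):
        # both inputs: (batch, n_tokens, dim); output: (batch, num_classes)
        fused = self.encoder(torch.cat([text_tokens, image_tokens], dim=1))
        return self.classifier(fused.mean(dim=1))  # mean-pool the fused tokens
```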
Step S180: and carrying out second disease identification according to the fusion characteristics to obtain a target identification result.
Specifically, the second disease identification in step S180 may use a multi-modal disease classification model to identify and classify the fusion features. Referring to fig. 4, the multi-modal disease classification model includes transformers encoders and a class classifier. The transformers encoders are used to fuse the visual depth features (i.e., the image features) with the text depth features (i.e., the inquiry features) to further refine the classification result. The multi-modal disease classification model can fuse and train data of different modalities based on meta-learning. The obtained target identification result can be sent for display to a terminal device held by the patient, such as a smartphone or notebook computer, so that the patient learns the disease information in time; it can also be sent to a terminal device held by a doctor, such as a smartphone or notebook computer, or to a desktop computer, background server, or the like available to the doctor, so that the doctor can fully understand the condition and formulate a corresponding treatment plan.
In an alternative embodiment, the second recognition result obtained by the second disease recognition may be used directly as the target recognition result, because in the second disease recognition the features of the two different modalities, vision and inquiry, are fully fused; the expressive power of the features is greatly increased, which improves recognition capability and accuracy.
For example, the manifestations of eczema and acne are similar and the visual difference between the lesions is small, yet intrinsic characteristic differences exist between them that can be obtained from inquiry data; the fusion features obtained by fusing the two modalities therefore have stronger representational capability, and disease information can be identified with high accuracy through the fusion features.
In an alternative embodiment, the results obtained by the second disease recognition may be combined with the response data to obtain the final target recognition result. After the second disease identification, multiple (two or more) second identification results are obtained, together with the probability corresponding to each. Part of the response data consists of the patient's answers to the question data, and based on those answers the probability of some recognition results can be increased and the probability of others decreased. Combining the two sets of probabilities, the disease identification result corresponding to the maximum probability is obtained and taken as the final identification result.
After the first disease identification, n (for example, 3) first identification results are obtained together with n corresponding first probability values, and question data is obtained from the database according to the n first identification results, the question data including multiple questions with options. For example, there are n questions with options, corresponding respectively to the n recognition results. The patient answers the n questions, generating selection data. The background system then readjusts the probabilities of the identification results according to the selection data: for a recognition result whose option is answered 'yes' (i.e., the patient's condition matches the questioned condition) the probability is increased, while for recognition results whose option is answered 'no' the probability is decreased, yielding n new first probability values.
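A minimal sketch of this adjustment; the boost and damping factors and the renormalization are illustrative assumptions, since the embodiment only states that 'yes' raises and 'no' lowers a result's probability:

```python
def adjust_by_answers(probs, answers, boost=1.5, damp=0.5):
    """Raise the probability of results answered 'yes' and lower those
    answered 'no', then renormalize so the values sum to 1."""
    raw = {d: p * (boost if answers.get(d, False) else damp) for d, p in probs.items()}
    total = sum(raw.values())
    return {d: p / total for d, p in raw.items()}

# Example with hypothetical values:
# adjust_by_answers({"A": 0.4, "B": 0.3, "C": 0.3}, {"A": True, "B": False, "C": False})
```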
After the second disease identification, m (for example, 3) second identification results are obtained together with m corresponding second probability values. The n recognition results obtained by the first disease identification and the m recognition results obtained by the second disease identification may be the same or different; for example, both may be: eczema, psoriasis, a certain tinea. The probability of each recognition result is then recomputed from the n new first probability values and the m second probability values. For example, if the n new first probability values are A 40%, B 30%, C 20% and the m second probability values are A 50%, B 20%, C 20%, the two can be averaged to give the final probability values A 45%, B 25%, C 20%, and the recognition result with the highest probability value is taken as the final target recognition result. Alternatively, different weights may be assigned to the n new first probability values and the m second probability values when computing the final probability values.
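The combination step is a weighted sum over the two stages' probabilities; with equal weights it reproduces the plain average in the example above (A 45%, B 25%, C 20%). A minimal sketch:

```python
def combine_probabilities(first, second, w1=0.5, w2=0.5):
    """Weighted combination of the two stages' probabilities; returns the
    disease with the highest combined value."""
    diseases = set(first) | set(second)
    final = {d: w1 * first.get(d, 0.0) + w2 * second.get(d, 0.0) for d in diseases}
    return max(final, key=final.get)

# first  = {"A": 0.40, "B": 0.30, "C": 0.20}
# second = {"A": 0.50, "B": 0.20, "C": 0.20}
# equal weights -> {"A": 0.45, "B": 0.25, "C": 0.20}; "A" is the final result
```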
In an alternative embodiment, the final target recognition result may be obtained in combination with the result obtained by the first disease recognition and the result obtained by the second disease recognition.
After the first disease identification, multiple (two or more) first identification results and their corresponding probabilities are obtained; after the second disease identification, multiple (two or more) second identification results and their corresponding probabilities are obtained. The probabilities from the two stages of first disease identification and second disease identification can therefore be combined, and the identification result corresponding to the maximum combined probability is taken as the final target identification result. A weight can also be assigned to each stage, the final probability of each recognition result calculated, and the recognition result corresponding to the maximum probability taken as the final recognition result.
Referring to fig. 5, suppose a patient has psoriasis (a skin disease). After photographing and first disease identification with the single-mode model, multiple results are obtained, and the three with the highest probabilities (the top-3 results) are taken from them: eczema, psoriasis, and a certain tinea. According to the top-3 results, the clinical characteristics and phenotypes of the three diseases are obtained from a disease-association knowledge base and targeted questions are posed for each disease; if the patient's answers are all 'yes', effective inquiry information is obtained. If the patient answers 'no', it is possible that none of the top-3 results of the first disease identification matches the disease actually shown in the picture, and the patient is asked to describe the disease (spoken and/or written) in a conventional inquiry manner. Combining the effective inquiry information and the conventional-inquiry description, inquiry features are extracted with natural language processing, and second disease identification is performed with the multi-modal model to obtain second identification results. The first and second identification results are combined, and the identification result with the highest probability is taken as the final recognition result.
The above method is described below in connection with an application scenario. A patient has a skin disease and photographs the focus area with a mobile phone terminal. During shooting, guiding prompts (such as text prompts, popup prompts, or a frame marking the target acquisition area) are provided to help the patient capture a clear and effective image. The system checks the image the patient has shot: if it meets the requirements, image recognition proceeds; otherwise the patient is prompted to re-shoot until a satisfactory image is obtained.
After the background system obtains an image meeting the requirements, it performs feature extraction and first disease identification on the image, obtaining 3 first identification results and their corresponding probability values. The question data is then acquired from the database according to the 3 first recognition results and fed back to the mobile phone terminal.
The mobile phone terminal displays the question data, and the patient replies to the questions in it: some questions have options, which the patient answers by selecting an option; other questions require the patient to reply with a description (text and/or voice). The mobile phone terminal turns the patient's replies into response data and feeds it back to the background system.
The background system extracts inquiry features from the response data using natural language processing, inputs the inquiry features and the image features to the transformers module for fusion to obtain fusion features, and performs second disease identification according to the fusion features, obtaining 3 identification results and their corresponding probability values. The identification results of the first and second disease identifications are combined, the identification result corresponding to the maximum probability value is taken as the final identification result, and the final identification result is fed back to the patient's mobile phone terminal.
An embodiment of the present invention proposes a computer-readable storage medium storing one or more programs executable by one or more processors to implement the following specific steps as shown in fig. 1:
step S110: and acquiring an image to be identified, wherein the image to be identified contains disease information.
Specifically, the image to be identified can be acquired by the patient through an image acquisition device and then uploaded, wherein the image acquisition device can be a mobile phone, a tablet computer, a camera, or the like. The image is obtained by photographing the diseased area of the patient and contains disease information, i.e., the symptoms the disease produces on the body: for a skin disease, for example, red spots, acne, or ulcerative inflammation on the skin; for a black nail, a black area on the fingernail. The patient aims the image acquisition device at the diseased part, and the captured image serves as the key identification input, so it must be clear and effective; otherwise the subsequent identification result is affected. For example, if the image is out of focus and blurred in places, the subsequent recognition effect is directly degraded; likewise, if the picture is not comprehensive enough and captures only a local part of the diseased area rather than the whole of it, the recognition effect also suffers. Therefore, during image capture the quality of the image must be strictly controlled, and an unsatisfactory image must trigger a prompt to re-shoot. Referring to fig. 2, step S110 may further include:
Step S111: acquiring an image containing a focus area according to a preset guidance prompt, wherein the guidance prompt is used for prompting the position of the focus area;
Step S112: detecting whether the image containing the focus area meets the acquisition requirement, if so, taking the image containing the focus area as an image to be identified, and if not, prompting to acquire the image containing the focus area again.
Explaining with an image acquired by a mobile phone as an example: when the patient opens the camera to photograph the focus area, a corresponding region frame can be displayed on the mobile phone screen, prompting the patient to aim the region frame at the focus area as closely as possible when shooting. In addition, the illumination of the global area and the local area can be monitored in real time to avoid pictures that are too dark due to poor lighting, and whether the focal length falls within a set interval can be detected in real time, prompting the patient to capture the image once the focal length meets the requirement. After the patient shoots the image, the mobile phone can automatically upload it to the background system, i.e., the system that performs disease identification on the image. After receiving the image to be identified, the system first checks whether the brightness, blur value, and so on of the image meet the requirements; if so, the image is identified, and if not, the result is fed back to the mobile phone terminal to prompt the patient to take the picture again. These constraints ensure consistency of color, illumination, brightness, blur value, and the like across the obtained dermatological pictures, thereby improving the subsequent recognition effect. It should be noted here that the detailed steps listed in this embodiment are optional, and do not imply that the disease information identifying method of the present invention must execute all of them.
Step S120: and extracting features according to the image to be identified to obtain image features.
In particular, a depth network with a multi-scale feature pyramid may be employed for feature extraction. There are two main ways to handle multiple scales in visual tasks: image pyramids and feature pyramids. A feature pyramid effectively combines multi-scale feature fusion with multi-scale prediction: the semantic information of the higher layers is propagated step by step down to the lower layers through upsampling and lateral connections. Concretely, the higher-layer features are upsampled by a factor of 2, the lower-layer features have their channel count adjusted by a 1×1 convolution, and the two results are added. Each layer's feature map is thus equivalent to a fusion of features at different resolutions, so objects are detected at the matching resolution; every layer keeps both an appropriate resolution and strong semantic features, which allows small targets to be detected while preserving strong semantic information for classification. The feature map output at each resolution is sent independently into the subsequent target detection process, and feature maps of different sizes are selected for regions of interest (ROIs) of different sizes (i.e., multi-size processing), which improves the subsequent recognition accuracy.
Step S130: and carrying out first disease identification according to the image characteristics to obtain a plurality of first identification results.
In step S130, a lightweight, efficient single-mode disease classification model may be used for the images. As shown in fig. 3, a depth network with a multi-scale feature pyramid serves as the backbone, followed by an attention mechanism module and a spatial pyramid pooling layer so as to attend to both global and local features. In fig. 3, the Attention Module is the attention mechanism module, SPP is the spatial pyramid pooling layer, FC is a fully connected layer, and Softmax is a softmax classifier. The attention mechanism module includes a channel attention unit (Channel Attention Module) and a spatial attention unit (Spatial Attention Module).
The channel attention unit (Channel Attention Module) compresses the feature map in the spatial dimensions into a one-dimensional vector before operating on it. When compressing in the spatial dimensions, not only mean pooling (Average Pooling) but also maximum pooling (Max Pooling) is considered. Average pooling and maximum pooling aggregate the spatial information of the feature map, which is sent to a shared network that compresses the spatial dimensions of the input feature map; the outputs are then summed element by element to produce the channel attention map. For a single feature map, channel attention focuses on what in the map is important. During gradient back-propagation, average pooling provides feedback for every pixel of the feature map, whereas maximum pooling provides gradient feedback only at the location of the largest response.
The spatial attention unit (Spatial Attention Module) compresses the channels: average pooling (AvgPool) and maximum pooling (MaxPool) are each performed along the channel dimension. MaxPool extracts the maximum value over the channels, once for each of the height × width positions; AvgPool extracts the mean value over the channels, likewise once per position. The two resulting feature maps (each with a single channel) are then concatenated to obtain a 2-channel feature map.
Step S140: and acquiring questioning data corresponding to the first recognition result from the database according to the first recognition result.
Through the recognition in step S130, one or more first recognition results are obtained. When only one recognition result is obtained, the result is uncontested and the subsequent processing is comparatively simple, so it is not discussed in detail here. When two or more recognition results are obtained, question data corresponding to the first recognition results is obtained from a database. The database is established in advance: it may be a knowledge base of question templates constructed by senior doctors for common disease types, or it may come from the Internet or from a question database provided by a third party.
In some embodiments, when the number of recognition results is relatively large, for example more than 3, the few results with the highest probabilities (for example, the first three) may be retained according to the probability of each recognition result, and the question data obtained for those retained results. The acquired question data can be returned for display to a terminal device held by the patient, such as a mobile phone, tablet computer, or notebook computer. The question data may be returned to the patient's mobile phone terminal in text and/or voice form; for example, if the recognized results are eczema, psoriasis, and a certain tinea, the clinical characteristics and phenotypes of the three diseases are obtained from the disease-association knowledge base and the corresponding question data is fed back.
Step S150: response data corresponding to the question data is acquired.
Specifically, after the question data is fed back to the patient's mobile phone terminal, it can be displayed in text and/or voice form. The patient answers according to the content of the question data, by text input or voice input; response data is generated from the input and sent to the background system.
In an alternative embodiment, the question data may include questions with options. Taking text question data as an example, question 1 may include options A/B/C/D, or the options 'yes' and 'no'. Because after image recognition the probable causes are already narrowed to a certain range (say, three recognition results), the corresponding question data, i.e., data for distinguishing among the three recognition results, is obtained for those results. For example, if the question for one disease is answered 'yes' while the question for another disease is answered 'no', information about the corresponding disease can be further confirmed.
If effective data cannot be obtained through option-based questioning, that is, if the recognition results are difficult to distinguish further from the patient's answers (for example, the patient answers 'no' to every question, suggesting that the three candidate results may not include the disease the patient actually has), the patient is asked to describe the disease in response to the question data, in text or voice form, so as to form inquiry feature information. The patient describes the shape, size, distribution, location, duration, and other aspects of the disease; the corresponding response data is generated and fed back to the background system.
Step S160: and carrying out natural language processing on the response data, and extracting to obtain inquiry features.
In step S160, Natural Language Processing (NLP) may be used to extract inquiry features from the response data, where the inquiry features include descriptions of the shape, location, size, color, and so on of the disease. For example, the visual manifestations of eczema and acne are similar and the visual difference between the lesions is small, yet essential characteristic differences exist between the two diseases and can be obtained from inquiry data; fusing the rich expressive power of the inquiry features can therefore greatly improve recognition accuracy.
Step S170: and inputting the inquiry features and the image features into a transformers module for feature fusion to obtain fusion features.
Specifically, the inquiry features extracted from the response data by natural language processing and the image features extracted by the feature-pyramid depth network are input together to the transformers module for feature fusion, which improves the expressive power of the features.
Step S180: and carrying out second disease identification according to the fusion characteristics to obtain a target identification result.
Specifically, the second disease identification in step S180 may use a multi-modal disease classification model to identify and classify the fusion features. Referring to fig. 4, the multi-modal disease classification model includes transformers encoders and a class classifier. The transformers encoders are used to fuse the visual depth features (i.e., the image features) with the text depth features (i.e., the inquiry features) to further refine the classification result. The multi-modal disease classification model can fuse and train data of different modalities based on meta-learning. The obtained target identification result can be sent for display to a terminal device held by the patient, such as a smartphone or notebook computer, so that the patient learns the disease information in time; it can also be sent to a terminal device held by a doctor, such as a smartphone or notebook computer, or to a desktop computer, background server, or the like available to the doctor, so that the doctor can fully understand the condition and formulate a corresponding treatment plan.
In an alternative embodiment, the result obtained by the second disease recognition is used as the final recognition result, because in the second disease recognition the features of the two different modalities, vision and inquiry, are fully fused; the expressive power of the features is greatly increased, which improves recognition capability and accuracy.
For example, the manifestations of eczema and acne are similar and the visual difference between the lesions is small, yet intrinsic characteristic differences exist between them that can be obtained from inquiry data; the fusion features obtained by fusing the two modalities therefore have stronger representational capability, and disease information can be identified with high accuracy through the fusion features.
In an alternative embodiment, the results obtained by the second disease recognition are combined with the response data to obtain the final target recognition result. After the second disease identification, multiple (two or more) second identification results are obtained, together with the probability corresponding to each. Part of the response data consists of the patient's answers to the question data, and based on those answers the probability of some recognition results can be increased and the probability of others decreased. Combining the two sets of probabilities, the disease identification result corresponding to the maximum probability is obtained and taken as the final identification result.
In an alternative embodiment, the final target recognition result may be obtained in combination with the result obtained by the first disease recognition and the result obtained by the second disease recognition.
After the first disease identification, multiple (two or more) first identification results and their corresponding probabilities are obtained; after the second disease identification, multiple (two or more) second identification results and their corresponding probabilities are obtained. The probabilities from the two stages of first disease identification and second disease identification can therefore be combined, and the identification result corresponding to the maximum combined probability is taken as the final target identification result. A weight can also be assigned to each stage, the final probability of each recognition result calculated, and the recognition result corresponding to the maximum probability taken as the final recognition result.
Referring to fig. 5, suppose a patient has psoriasis (a skin disease). After photographing and first disease identification with the single-mode model, multiple results are obtained, and the three with the highest probabilities (the top-3 results) are taken from them: eczema, psoriasis, and a certain tinea. According to the top-3 results, the clinical characteristics and phenotypes of the three diseases are obtained from a disease-association knowledge base and targeted questions are posed for each disease; if the patient's answers are all 'yes', effective inquiry information is obtained. If the patient answers 'no', it is possible that none of the top-3 results of the first disease identification matches the disease actually shown in the picture, and the patient is asked to describe the disease (spoken and/or written) in a conventional inquiry manner. Combining the effective inquiry information and the conventional-inquiry description, inquiry features are extracted with natural language processing, and second disease identification is performed with the multi-modal model to obtain second identification results. The first and second identification results are combined, and the identification result with the highest probability is taken as the final recognition result.
The above method is described below in connection with an application scenario. A patient has a skin disease and photographs the focus area with a mobile phone terminal. During shooting, guiding prompts (such as text prompts, popup prompts, or a frame marking the target acquisition area) are provided to help the patient capture a clear and effective image. The system checks the image the patient has shot: if it meets the requirements, image recognition proceeds; otherwise the patient is prompted to re-shoot until a satisfactory image is obtained.
After the background system obtains an image meeting the requirements, it performs feature extraction and first disease identification on the image, obtaining 3 first identification results and their corresponding probability values. The question data is then acquired from the database according to the 3 first recognition results and fed back to the mobile phone terminal.
The mobile phone terminal displays the question data, and the patient replies to the questions in it: some questions have options, which the patient answers by selecting an option; other questions require the patient to reply with a description (text and/or voice). The mobile phone terminal turns the patient's replies into response data and feeds it back to the background system.
The background system extracts inquiry features from the response data using natural language processing, inputs the inquiry features and the image features to the transformers module for fusion to obtain fusion features, and performs second disease identification according to the fusion features, obtaining 3 identification results and their corresponding probability values. The identification results of the first and second disease identifications are combined, the identification result corresponding to the maximum probability value is taken as the final identification result, and the final identification result is fed back to the patient's mobile phone terminal.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
The preferred embodiments of the present invention have been described above with reference to the accompanying drawings, and thus do not limit the scope of the claims of the present invention. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the present invention shall fall within the scope of the appended claims.

Claims (8)

1. A disease information identification method, characterized by comprising the steps of:
Acquiring an image to be identified, wherein the image to be identified contains disease information; the image to be identified is acquired by a user through an image acquisition device;
extracting features according to the image to be identified to obtain image features;
Carrying out first disease identification according to the image characteristics to obtain a plurality of first identification results;
Acquiring questioning data corresponding to the first recognition result from a database according to the first recognition result;
Acquiring response data corresponding to the question data;
natural language processing is carried out on the response data, and inquiry features are extracted;
inputting the inquiry feature and the image feature into a transformers module for feature fusion to obtain a fusion feature;
performing second disease identification according to the fusion characteristics to obtain a target identification result;
The disease information includes a focus area, and the acquiring an image to be identified includes:
acquiring an image containing the focus area according to a preset guidance prompt, wherein the guidance prompt is used for prompting the position of the focus area;
Detecting whether the image containing the focus area meets the acquisition requirement, if so, taking the image containing the focus area as an image to be identified, and if not, prompting to acquire the image containing the focus area again;
the questioning data comprises questioning data with preset options, and the response data comprises selection data generated by selecting the options;
The first disease identification is carried out according to the image characteristics to obtain a plurality of first identification results, including:
Carrying out first disease identification according to the image characteristics to obtain n first identification results and first probability values corresponding to the first identification results;
Adjusting the n first probability values according to the selection data to obtain n new first probability values;
and carrying out second disease identification according to the fusion characteristics to obtain a target identification result, wherein the second disease identification comprises the following steps:
Carrying out second disease identification according to the fusion characteristics to obtain m second identification results and second probability values corresponding to the second identification results;
and acquiring a recognition result corresponding to the highest probability value according to the n new first probability values and the m second probability values, and taking the recognition result as a target recognition result.
2. The disease information identification method according to claim 1, wherein the feature extraction is performed according to the image to be identified to obtain image features, including:
up-sampling and feature extraction are carried out on the image to be identified by adopting a depth network based on a feature pyramid, so that a plurality of image candidate frames are obtained;
And carrying out multi-scale processing on the obtained multiple image candidate frames to obtain image features.
3. The method for identifying disease information according to claim 2, wherein the step of identifying the first disease according to the image features to obtain a plurality of first identification results includes:
Inputting the image features into an attention mechanism module, and outputting fine features, wherein the attention mechanism module comprises a channel attention unit and a space attention unit; the channel attention unit is used for compressing the image features in the space dimension, and the space attention unit is used for compressing the image features in the channel dimension;
And processing the fine features sequentially through a spatial pyramid pooling layer, full connection and a Softmax classifier to obtain a plurality of first recognition results.
4. The method for identifying disease information according to claim 1, wherein the performing the second disease identification according to the fusion feature to obtain the target identification result comprises:
Performing second disease identification on the fusion features by adopting a multi-mode disease classification model; the multi-modal disease classification model includes transformers encoders and a class classifier;
The transformers encoder is used for fusing the features of the two modalities, picture and text;
the category classifier is used for classifying the diseases according to the fusion characteristics and outputting recognition results.
5. The method for identifying disease information according to claim 1, wherein the performing the second disease identification according to the fusion feature to obtain the target identification result comprises:
performing second disease identification according to the fusion characteristics to obtain a target identification result;
and acquiring a final target recognition result according to the first recognition result and the target recognition result.
6. A disease information recognition system, comprising:
The image data acquisition module is used for acquiring an image to be identified, wherein the image to be identified contains disease information; the image to be identified is acquired by a user through an image acquisition device;
the first feature extraction module is used for carrying out feature extraction according to the image to be identified to obtain image features;
the first recognition module is used for carrying out first disease recognition according to the image characteristics to obtain a plurality of first recognition results;
The questioning data acquisition module is used for acquiring questioning data corresponding to the first recognition result from a database according to the first recognition result;
the response data acquisition module is used for acquiring response data corresponding to the question data;
The second feature extraction module is used for carrying out natural language processing on the response data and extracting to obtain inquiry features;
The feature fusion module is used for inputting the inquiry features and the image features to the transformers module for feature fusion to obtain fusion features;
The second recognition module is used for carrying out second disease recognition according to the fusion characteristics to obtain a target recognition result;
The disease information includes a focus area, and the acquiring an image to be identified includes:
acquiring an image containing the focus area according to a preset guidance prompt, wherein the guidance prompt is used for prompting the position of the focus area;
Detecting whether the image containing the focus area meets the acquisition requirement, if so, taking the image containing the focus area as an image to be identified, and if not, prompting to acquire the image containing the focus area again;
the questioning data comprises questioning data with preset options, and the response data comprises selection data generated by selecting the options;
The first disease identification is carried out according to the image characteristics to obtain a plurality of first identification results, including:
Carrying out first disease identification according to the image characteristics to obtain n first identification results and first probability values corresponding to the first identification results;
Adjusting the n first probability values according to the selection data to obtain n new first probability values;
and carrying out second disease identification according to the fusion characteristics to obtain a target identification result, wherein the second disease identification comprises the following steps:
Carrying out second disease identification according to the fusion characteristics to obtain m second identification results and second probability values corresponding to the second identification results;
and acquiring a recognition result corresponding to the highest probability value according to the n new first probability values and the m second probability values, and taking the recognition result as a target recognition result.
7. A disease information identifying apparatus, comprising:
At least one processor;
At least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-5.
8. A storage medium having stored therein a processor executable program, which when executed by a processor is adapted to carry out the method of any one of claims 1-5.
CN202110744807.6A 2021-06-30 2021-06-30 Disease information identification method, system, device and storage medium Active CN113469049B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110744807.6A CN113469049B (en) 2021-06-30 2021-06-30 Disease information identification method, system, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110744807.6A CN113469049B (en) 2021-06-30 2021-06-30 Disease information identification method, system, device and storage medium

Publications (2)

Publication Number Publication Date
CN113469049A CN113469049A (en) 2021-10-01
CN113469049B true CN113469049B (en) 2024-05-10

Family

ID=77878268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110744807.6A Active CN113469049B (en) 2021-06-30 2021-06-30 Disease information identification method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN113469049B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020034642A1 (en) * 2018-08-17 2020-02-20 齐鲁工业大学 Automatic medical question answering method and apparatus, storage medium, and electronic device
CN111444960A (en) * 2020-03-26 2020-07-24 上海交通大学 Skin disease image classification system based on multi-mode data input
CN111626293A (en) * 2020-05-21 2020-09-04 咪咕文化科技有限公司 Image text recognition method and device, electronic equipment and storage medium
CN111816301A (en) * 2020-07-07 2020-10-23 平安科技(深圳)有限公司 Medical inquiry assisting method, device, electronic equipment and medium
CN111832581A (en) * 2020-09-21 2020-10-27 平安科技(深圳)有限公司 Lung feature recognition method and device, computer equipment and storage medium
CN112070069A (en) * 2020-11-10 2020-12-11 支付宝(杭州)信息技术有限公司 Method and device for identifying remote sensing image
CN112233698A (en) * 2020-10-09 2021-01-15 中国平安人寿保险股份有限公司 Character emotion recognition method and device, terminal device and storage medium
CN112560796A (en) * 2020-12-29 2021-03-26 平安银行股份有限公司 Human body posture real-time detection method and device, computer equipment and storage medium
CN112784801A (en) * 2021-02-03 2021-05-11 紫东信息科技(苏州)有限公司 Text and picture-based bimodal gastric disease classification method and device
CN112801168A (en) * 2021-01-25 2021-05-14 江苏大学 Tumor image focal region prediction analysis method and system and terminal equipment

Also Published As

Publication number Publication date
CN113469049A (en) 2021-10-01

Similar Documents

Publication Publication Date Title
US11887311B2 (en) Method and apparatus for segmenting a medical image, and storage medium
US20220092882A1 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN108898579B (en) Image definition recognition method and device and storage medium
CN109919928B (en) Medical image detection method and device and storage medium
CN110689025B (en) Image recognition method, device and system and endoscope image recognition method and device
US20190392587A1 (en) System for predicting articulated object feature location
CN109346159B (en) Case image classification method, device, computer equipment and storage medium
CN112446476A (en) Neural network model compression method, device, storage medium and chip
CN108205684B (en) Image disambiguation method, device, storage medium and electronic equipment
CN114241548A (en) Small target detection algorithm based on improved YOLOv5
CN115132313A (en) Automatic generation method of medical image report based on attention mechanism
CN110059579B (en) Method and apparatus for in vivo testing, electronic device, and storage medium
CN112330624A (en) Medical image processing method and device
US20190340473A1 (en) Pattern recognition method of autoantibody immunofluorescence image
WO2019128564A1 (en) Focusing method, apparatus, storage medium, and electronic device
CN112818722A (en) Modular dynamically configurable living body face recognition system
CN113221771A (en) Living body face recognition method, living body face recognition device, living body face recognition equipment, storage medium and program product
CN113705361A (en) Method and device for detecting model in living body and electronic equipment
CN112446322A (en) Eyeball feature detection method, device, equipment and computer-readable storage medium
CN113722458A (en) Visual question answering processing method, device, computer readable medium and program product
CN115761356A (en) Image recognition method and device, electronic equipment and storage medium
CN114332993A (en) Face recognition method and device, electronic equipment and computer readable storage medium
CN113469049B (en) Disease information identification method, system, device and storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN111598144A (en) Training method and device of image recognition model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant