CN113902743A - Method and device for identifying diabetic retinopathy based on cloud computing - Google Patents

Method and device for identifying diabetic retinopathy based on cloud computing

Info

Publication number
CN113902743A
CN113902743A
Authority
CN
China
Prior art keywords
image
lesion
fundus
retina
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111494551.4A
Other languages
Chinese (zh)
Inventor
靳雪
郑健
徐立璋
刘国
尹荣荣
张倩
洪姣
邓科
章书波
胡汉平
毛昱升
姜兴民
朱松林
刘芷萱
赵先洪
李银谷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Aiyanbang Technology Co ltd
Original Assignee
Wuhan Aiyanbang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Aiyanbang Technology Co ltd filed Critical Wuhan Aiyanbang Technology Co ltd
Priority to CN202111494551.4A priority Critical patent/CN113902743A/en
Publication of CN113902743A publication Critical patent/CN113902743A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Eye Examination Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for identifying diabetic retinopathy based on cloud computing. The method first obtains a retinal image annotated with fundus lesion areas, then inputs it into a Mask RCNN-based retina fundus image multi-lesion classification network model and trains until the model converges. A fundus picture to be diagnosed is then acquired and input into the trained model to obtain a diagnosis result. The invention diagnoses diabetic retinopathy with image recognition technology, improving both the accuracy and the speed of diagnosis of diabetic retinopathy.

Description

Method and device for identifying diabetic retinopathy based on cloud computing
Technical Field
The invention relates to the technical field of computer vision of artificial intelligence and medical image computer processing, in particular to a method and a device for identifying diabetic retinopathy based on cloud computing.
Background
Diabetes mellitus, a common disease that endangers human health, has long drawn researchers' attention. Meanwhile, the growing number of diabetic patients and the uneven distribution of medical resources stand in increasingly sharp contradiction.
Therefore, there is a need for a method for identifying diabetic retinopathy based on computer processing technology, thereby improving the accuracy and speed of diagnosis of diabetic retinopathy.
Disclosure of Invention
The invention provides a method and a device for identifying diabetic retinopathy based on cloud computing, which can improve the diagnosis accuracy and the diagnosis speed of the diabetic retinopathy.
The invention provides a method for identifying diabetic retinopathy based on cloud computing, which comprises the following steps:
acquiring a retina image marked with a fundus image lesion area;
inputting the retina image marked with the fundus image lesion area into a Mask RCNN-based retina fundus image multi-lesion classification network model for training until the model converges;
acquiring a fundus picture to be diagnosed;
and inputting the fundus picture to be diagnosed into a trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain a diagnosis result.
Specifically, the backbone network of the Mask RCNN-based retina fundus image multi-lesion classification network model adopts Resnet-101, and a Non-local module is added before the last block of the Res4 stage of the backbone network;
the expression of the Non-local module is shown as follows:
y_i = (1/C(x)) Σ_j f(x_i, x_j) g(x_j)
where C(x) is a normalization function; i is the index of the response output position and j enumerates all possible positions; x denotes the input information and y the output information of the same size as x; f is a function computing the correlation, chiefly the relationship between each pixel and all associated pixels; and the unary function g computes a representation of the input information at position j.
Specifically, the method further comprises the following steps:
in the Mask RCNN-based retina fundus image multi-lesion classification network model, replacing original classification cross entropy loss with focal loss;
FL(p_t) = -α_t (1 - p_t)^γ log(p_t)
where FL denotes the focal loss function, p_t denotes the probability that the predicted sample belongs to the correct class, α_t is a weighting factor, and γ is a focusing parameter.
Specifically, the step of inputting the fundus picture to be diagnosed into a trained Mask RCNN-based retina fundus image multi-lesion classification network model to obtain a diagnosis result includes:
and inputting the fundus picture to be diagnosed into the trained Mask RCNN-based retina fundus image multi-lesion classification network model to obtain processing time, maximum confidence of lesions, image id and a lesion segmentation result picture.
Specifically, after obtaining the lesion segmentation result map, the method further includes:
converting the lesion segmentation result map into a binary image;
and encoding the binary image with the base64 library, decoding the result into a utf-8 string, and transmitting it as a string in json format.
The invention also provides a device for identifying diabetic retinopathy based on cloud computing, which comprises:
the retina image acquisition module is used for acquiring a retina image marked with a fundus image lesion area;
the network model training module is used for inputting the retina image marked with the fundus image pathological change area into a Mask RCNN-based retina fundus image multi-pathological change classification network model for training until the model converges;
the fundus picture acquisition module is used for acquiring a fundus picture to be diagnosed;
and the diagnosis module is used for inputting the fundus picture to be diagnosed into a trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain a diagnosis result.
Specifically, the backbone network of the Mask RCNN-based retina fundus image multi-lesion classification network model adopts Resnet-101, and a Non-local module is added before the last block of the Res4 stage of the backbone network;
the expression of the Non-local module is shown as follows:
y_i = (1/C(x)) Σ_j f(x_i, x_j) g(x_j)
where C(x) is a normalization function; i is the index of the response output position and j enumerates all possible positions; x denotes the input information and y the output information of the same size as x; f is a function computing the correlation, chiefly the relationship between each pixel and all associated pixels; and the unary function g computes a representation of the input information at position j.
Specifically, the method further comprises the following steps:
in the Mask RCNN-based retina fundus image multi-lesion classification network model, replacing original classification cross entropy loss with focal loss;
FL(p_t) = -α_t (1 - p_t)^γ log(p_t)
where FL denotes the focal loss function, p_t denotes the probability that the predicted sample belongs to the correct class, α_t is a weighting factor, and γ is a focusing parameter.
Specifically, the diagnosis module is specifically configured to input the fundus image to be diagnosed into the trained Mask RCNN-based retina fundus image multi-lesion classification network model, so as to obtain processing time, a maximum confidence of a lesion, an image id, and a lesion segmentation result map.
Specifically, the device further comprises:
a result image conversion module for converting the lesion segmentation result image into a binary image;
and the result graph transmission module is used for encoding the binary image with the base64 library, decoding the result into a utf-8 string, and transmitting it as a string in json format.
One or more technical schemes provided by the invention at least have the following technical effects or advantages:
the method comprises the steps of firstly obtaining a retinal image marked with a fundus image pathological change area, and then inputting the retinal image marked with the fundus image pathological change area into a Mask RCNN-based retinal fundus image multi-pathological change classification network model for training until the model converges. And then, acquiring a fundus picture to be diagnosed, and finally inputting the fundus picture to be diagnosed into a trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain a diagnosis result. The invention realizes the diagnosis of the diabetic retinopathy based on the image recognition technology, and improves the diagnosis accuracy and the diagnosis speed of the diabetic retinopathy.
Drawings
Fig. 1 is a flowchart of a method for identifying diabetic retinopathy based on cloud computing according to an embodiment of the present invention;
FIG. 2 is a block diagram of a cloud-based diabetic retinopathy recognition apparatus according to an embodiment of the present invention;
FIG. 3 is a block diagram of an auxiliary diagnostic system for diabetic retinopathy constructed in accordance with an embodiment of the present invention;
fig. 4 is a flowchart of the operation of the doctor client in the system for auxiliary diagnosis of diabetic retinopathy constructed according to the embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a method and a device for identifying diabetic retinopathy based on cloud computing, which can improve the diagnosis accuracy and the diagnosis speed of the diabetic retinopathy.
In order to achieve the technical effects, the technical scheme in the embodiment of the invention has the following general idea:
the auxiliary diagnosis system for the diabetic retinopathy through cloud computing provided by the embodiment of the invention comprises a network server (installed on an Ubuntu system) and a mobile small program client, wherein the network server is constructed by using a lightweight web frame flash in Python. Firstly, training a model locally; then, serializing the trained model, for example, using a pickle module; then, loading a model on a server, and providing a service interface to the outside by using a flash; and finally, when the data to be diagnosed are transmitted on line, directly using the trained model to carry out prediction to obtain a prediction result. The lesion blocks are regarded as examples to be detected, the boundary of the segmented lesion blocks can be taken as a target in the segmentation part, and the network is taken as the basis of a lesion detection network.
For better understanding of the above technical solutions, the following detailed descriptions will be provided in conjunction with the drawings and the detailed description of the embodiments.
Referring to fig. 1, the method for identifying diabetic retinopathy based on cloud computing according to the embodiment of the present invention includes:
step S110: acquiring a retina image marked with a fundus image lesion area;
specifically, before a retina image marked with a fundus image pathological change area is obtained, a medical treatment end firstly obtains the retina image and marks the fundus image pathological change area in the retina image;
this step is explained in detail:
the medical treatment end collects a certain number of retinal fundus images through a fundus camera, preprocesses the images, unifies the image size to 1024 x 1024 pixels, uses the expansion operation in the morphological processing, and adopts a circular area with the radius of 2 as a structural element to connect the surrounding small connected domains together, thereby forming a large target to improve the situation that the number of the targets in the images is excessive. And then, the processed images are delivered to a doctor for labeling to obtain the lesion type and the labeling label of the lesion area of each fundus color photograph image, and a training set, a verification set and a test set are generated by using the labeling data.
Step S120: inputting the retina image marked with the fundus image lesion area into a Mask RCNN-based retina fundus image multi-lesion classification network model for training until the model converges; in this embodiment, the neural network model is trained on the training set, the network parameters are initialized with Xavier initialization and optimized with stochastic gradient descent (SGD), with a learning rate of 0.001, a momentum of 0.9, and a weight-decay regularization term of 0.0001.
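The training configuration above (Xavier initialization; SGD with learning rate 0.001, momentum 0.9, weight decay 0.0001) can be sketched in plain numpy. The function names and the single-update formulation are illustrative, not part of the patent:

```python
import numpy as np

rng = np.random.default_rng(0)


def xavier_init(fan_in: int, fan_out: int) -> np.ndarray:
    """Xavier (Glorot) uniform initialization for a weight matrix."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def sgd_step(w, grad, velocity, lr=0.001, momentum=0.9, weight_decay=0.0001):
    """One SGD update with the embodiment's hyperparameters:
    learning rate 0.001, momentum 0.9, weight-decay (L2) term 0.0001."""
    grad = grad + weight_decay * w           # weight-decay regular term
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity
```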
Specifically, the backbone network of the Mask RCNN-based retina fundus image multi-lesion classification network model adopts Resnet-101. The Res4 stage of the backbone produces feature maps of size 14 x 14; balancing computation against extracted information, a Non-local module is added before the last block of the Res4 stage, and the receptive field is enlarged through this Non-local module.
Specifically, the expression of Non-local module is shown as formula (1):
y_i = (1/C(x)) Σ_j f(x_i, x_j) g(x_j)    (1)
where C(x) is a normalization function; i is the index of the response output position (in space, time, or spacetime) and j enumerates all possible positions. x denotes the input information and y the output information of the same size as x; these may be pictures and/or video, but are in general their feature maps. f is a function computing the correlation, chiefly the relationship between each pixel and all associated pixels; for example, it may decay with distance, so that more distant positions contribute less. The unary function g computes a representation of the input information at position j.
The Non-local operation is encapsulated as a module designed with reference to residual networks, as shown in formula (2):
z_i = W_z y_i + x_i    (2)
where y_i is the output of formula (1), W_z is a weight matrix applied to each y_i for the embedding operation, and the + x_i term is a residual structure, which allows the Non-local module to be inserted without degrading the original model. The embedded Gaussian function is one form of f in formula (1); it is a simple extension of the Gaussian function that computes similarity in a shared embedding space:
f(x_i, x_j) = e^(θ(x_i)ᵀ φ(x_j))

where C(x) = Σ_j f(x_i, x_j) is the normalization function.
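The embedded-Gaussian Non-local operation of formulas (1) and (2) can be sketched in numpy over a flattened feature map. This is an illustrative reconstruction under assumed shapes (N positions by C channels, with learned embedding matrices passed in explicitly), not the patent's implementation:

```python
import numpy as np


def non_local(x, theta_w, phi_w, g_w, z_w):
    """Embedded-Gaussian Non-local block:
    y_i = (1/C(x)) * sum_j f(x_i, x_j) g(x_j), with
    f = exp(theta(x_i)^T phi(x_j)) and C(x) = sum_j f(x_i, x_j),
    followed by the residual z_i = W_z y_i + x_i.

    x: (N, C) array of N positions with C channels.
    theta_w, phi_w, g_w: (C, D) embedding matrices; z_w: (D, C)."""
    theta, phi, g = x @ theta_w, x @ phi_w, x @ g_w
    f = np.exp(theta @ phi.T)                    # pairwise similarities f(x_i, x_j)
    y = (f / f.sum(axis=1, keepdims=True)) @ g   # divide by C(x): a softmax over j
    return y @ z_w + x                           # residual connection (+ x_i)
```

Dividing by C(x) makes the operation a softmax-weighted aggregation over all positions j, which is what enlarges the receptive field beyond the local convolutions.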
To avoid severe class imbalance during classification and make the model focus on misclassified samples, the method also includes:
in a retina fundus image multi-lesion classification network model based on Mask RCNN, original classification cross entropy loss is replaced by focal loss;
FL(p_t) = -α_t (1 - p_t)^γ log(p_t)
where FL is the focal loss function, p_t is the probability that the predicted sample belongs to the correct class, α_t is a weighting factor, and γ is a focusing parameter. Adjusting the weighting factor α_t avoids severe class imbalance during classification, and adjusting the focusing parameter γ makes the model focus more on misclassified samples.
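The focal loss can be written directly from the formula above. A small sketch follows; the default values α_t = 0.25 and γ = 2.0 are the common choices from the focal-loss literature, not values stated in this document:

```python
import numpy as np


def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t), where p_t is
    the predicted probability of the correct class. alpha_t and gamma
    defaults are assumptions (common literature values)."""
    p_t = np.asarray(p_t, dtype=float)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
```

Well-classified samples (p_t near 1) are down-weighted by the (1 - p_t)^γ factor, so the gradient is dominated by hard, misclassified samples.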
And (3) evaluating the network performance:
the evaluation index based on the entire image is determined based on whether or not a lesion exists in the detection result, and therefore, a label indicating the presence or absence of a lesion exists in one image at this time. If the prediction result matches the label result, the prediction is correct, and thus such evaluation is essentially a two-class problem. The classification capability of the network model is judged by using operator Operating characteristics (ROC), and the confidence threshold is set to 0.99. When a lesion with the confidence coefficient higher than 0.99 exists in the model detection image, the fundus image is considered to have the lesion. The model with the highest AUC (Area size Under ROC Curve) on the test set was used as the detection model finally adopted.
Step S130: acquiring a fundus picture to be diagnosed;
step S140: and (3) inputting the fundus picture to be diagnosed into the trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain a diagnosis result.
To explain this step in detail: inputting the fundus picture to be diagnosed into the trained Mask RCNN-based retina fundus image multi-lesion classification network model to obtain a diagnosis result includes:
and (3) inputting the fundus picture to be diagnosed into a trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain processing time, maximum confidence of lesions, image id and a lesion segmentation result graph.
In order to facilitate the transmission of the result image in a json format, thereby improving the efficiency of the system in identifying the labeling result, after obtaining the lesion segmentation result map, the method further includes:
converting the lesion segmentation result image into a binary image;
the binary image is firstly encoded by a base64 library, then decoded into an utf-8 format and transmitted in a json format in a character string mode.
Referring to fig. 2, the apparatus for recognizing diabetic retinopathy based on cloud computing according to the embodiment of the present invention includes:
a retina image acquisition module 100, configured to acquire a retina image labeled with a fundus image lesion region;
the network model training module 200 is used for inputting the retina images marked with the fundus image lesion areas into a Mask RCNN-based retina fundus image multi-lesion classification network model for training until the model converges; in this embodiment, a training set is used to train a neural network model, the network parameters are uniformly initialized by Xavier, and are optimized by using Stochastic Gradient optimization (SGD), the learning rate is 0.001, the learning momentum is 0.9, and a weight attenuation regular term of 0.0001 is also added.
Specifically, the backbone network of the Mask RCNN-based retina fundus image multi-lesion classification network model adopts Resnet-101. The Res4 stage of the backbone produces feature maps of size 14 x 14; balancing computation against extracted information, a Non-local module is added before the last block of the Res4 stage, and the receptive field is enlarged through this Non-local module.
Specifically, the expression of Non-local module is shown as formula (1):
y_i = (1/C(x)) Σ_j f(x_i, x_j) g(x_j)    (1)
where C(x) is a normalization function; i is the index of the response output position (in space, time, or spacetime) and j enumerates all possible positions. x denotes the input information and y the output information of the same size as x; these may be pictures and/or video, but are in general their feature maps. f is a function computing the correlation, chiefly the relationship between each pixel and all associated pixels; for example, it may decay with distance, so that more distant positions contribute less. The unary function g computes a representation of the input information at position j.
The Non-local operation is encapsulated as a module designed with reference to residual networks, as shown in formula (2):
z_i = W_z y_i + x_i    (2)
where y_i is the output of formula (1), W_z is a weight matrix applied to each y_i for the embedding operation, and the + x_i term is a residual structure, which allows the Non-local module to be inserted without degrading the original model. The embedded Gaussian function is one form of f in formula (1); it is a simple extension of the Gaussian function that computes similarity in a shared embedding space:
f(x_i, x_j) = e^(θ(x_i)ᵀ φ(x_j))

where C(x) = Σ_j f(x_i, x_j) is the normalization function.
To avoid severe class imbalance during classification and make the model focus on misclassified samples, the device also includes:
in a retina fundus image multi-lesion classification network model based on Mask RCNN, original classification cross entropy loss is replaced by focal loss;
FL(p_t) = -α_t (1 - p_t)^γ log(p_t)
where FL is the focal loss function, p_t is the probability that the predicted sample belongs to the correct class, α_t is a weighting factor, and γ is a focusing parameter. Adjusting the weighting factor α_t avoids severe class imbalance during classification, and adjusting the focusing parameter γ makes the model focus more on misclassified samples.
And (3) evaluating the network performance:
the evaluation index based on the entire image is determined based on whether or not a lesion exists in the detection result, and therefore, a label indicating the presence or absence of a lesion exists in one image at this time. If the prediction result matches the label result, the prediction is correct, and thus such evaluation is essentially a two-class problem. The classification capability of the network model is judged by using operator Operating characteristics (ROC), and the confidence threshold is set to 0.99. When a lesion with the confidence coefficient higher than 0.99 exists in the model detection image, the fundus image is considered to have the lesion. The model with the highest AUC (Area size Under ROC Curve) on the test set was used as the detection model finally adopted.
A fundus picture acquiring module 300 for acquiring a fundus picture to be diagnosed;
and the diagnosis module 400 is used for inputting the fundus picture to be diagnosed into the trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain a diagnosis result.
Specifically, the diagnosis module 400 is specifically configured to input a fundus image to be diagnosed into a trained Mask RCNN-based retina fundus image multi-lesion classification network model, so as to obtain a processing time, a maximum confidence of a lesion, an image id, and a lesion segmentation result map.
In order to facilitate the transmission of the result image in a json format, thereby improving the efficiency of the system in identifying the annotation result, the method further comprises the following steps:
the result image conversion module is used for converting the lesion segmentation result image into a binary image;
and the result graph transmission module is used for encoding the binary image by using a base64 library, decoding the binary image into an utf-8 format and transmitting the binary image in a json format in a character string mode.
Referring to fig. 3, the auxiliary diagnosis system for diabetic retinopathy constructed according to the embodiment of the present invention includes a network server and a mobile applet client. The network server comprises four modules: the Mask RCNN-based retina fundus image multi-lesion classification model, a memory, a processor, and a client interface. The mobile applet client interacts with the client interface of the network server: the client sends a post request whose content is a json-format list containing the addresses of the fundus pictures to be processed, left/right-eye labels, and the image ids. The network server processes the received post request with the processor and stores the images in its memory. The processor calls the previously trained Mask RCNN-based retina fundus image multi-lesion classification model to perform lesion detection on the image queue in sequence, saves the lesion segmentation result maps in the server memory, and returns a structured result description to the client, which includes the processing time, the maximum confidence of each of the three lesions (early diabetic exudation, hemorrhage, and cotton wool spot lesions), the image id, and the lesion segmentation result map. The processing time is measured from picture upload to network output; in the lesion detection result output by the network, each lesion block has a confidence, and the maximum confidence per lesion type is taken as that lesion's confidence; the image id is the id assigned when the image was uploaded; and the segmentation map output by the network is used as the result map.
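The applet client's post request body described above (a json list of picture address, left/right-eye label, and image id) can be sketched as follows; the field names are assumptions for illustration, not names given in the document:

```python
import json


def build_screening_request(images):
    """Build the JSON list the applet client POSTs to the server: one
    entry per fundus picture, carrying its address, a left/right-eye
    label, and the image id. Field names are hypothetical."""
    body = [
        {"address": img["address"], "eye": img["eye"], "id": img["id"]}
        for img in images
    ]
    return json.dumps(body)
```

The server-side result description travels back the same way: a json-serializable structure per image (processing time, per-lesion maximum confidence, image id, and the base64-encoded segmentation map).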
To conveniently transmit the result image in json format, the result image is first converted into a binary image; the binary image is then encoded with the base64 library, decoded into a utf-8 string, and placed in the json payload as a string for transmission. After processing, the segmented images of the three lesions (exudation, hemorrhage, and cotton wool spots) and the auxiliary diagnosis instructions are displayed on the mobile applet client.
Specifically, as shown in fig. 4, the applet client of the diagnosis assistance system operates as follows. A doctor enters the doctor-end screening entrance of the applet and selects one of the screening tasks, each labelled with the organisation of the people to be screened. After a task is selected, the applet jumps to a screening detail page listing everyone to be screened under that task. The doctor selects a person to screen and uploads fundus images of both of that person's eyes; the applet jumps to a result page and indicates that the detection result is pending, while the fundus images and related information are sent to the network server for processing. The processed lesion segmentation images are then displayed on the applet's result page, together with auxiliary diagnosis descriptions of whether each of the three lesion types is present.
The embodiment of the invention applies a closed-loop design to the whole system: all diagnosis is completed in the cloud server system, so the mobile applet can be called at any time and results are also stored in the background server. Since a doctor must make a disease judgment and medical diagnosis for each fundus image, every uploaded fundus image requires disease classification. This task matches the characteristics of neural network training, so the doctor's interpretation of a fundus picture is used as the picture's label, and the picture is added to the training data for the next round of neural network training. The system thus proceeds from picture labeling, picture processing, neural network training, and picture prediction to final diagnosis by a doctor; diagnosed pictures become the training pictures for the next round, and the model is trained iteratively. The larger the amount of training data, the more accurate the neural network model's predictions, which matches the design concept of the system.
When logged in at the patient end, a patient can communicate with the doctor end through the mobile phone applet at any time and can view his or her own fundus pictures and diagnosis results. The doctor photographs the patient with a handheld fundus camera; the camera can be bound to the corresponding doctor and uploads the captured fundus pictures directly to the system. The deep learning neural network model is trained according to the designed network structure after the training pictures have been preprocessed and labeled. After the system formally goes online, a large number of patient fundus pictures and diagnosis results are uploaded, which in effect increases the amount of training data. Therefore, an iterative training round can be scheduled at fixed time intervals, or whenever a certain amount of new data has accumulated, to further improve the diagnostic accuracy of the neural network model. When the doctor needs images to be processed and predicted, clicking the corresponding button in the system triggers the cloud fundus image processing module to detect and segment the uploaded fundus images, and the processing result is returned to the doctor client.
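The transmission step at the end of this flow corresponds to claims 5 and 10: the lesion segmentation result map is converted to a binary image, encoded with a base64 library, decoded into utf-8 format, and transmitted as a character string inside a json payload. A minimal stdlib sketch of that round trip (the function names and the threshold default are illustrative, not from the patent):

```python
import base64
import json


def pack_segmentation_mask(mask_rows, threshold=128):
    """Binarize a grayscale lesion-segmentation map and pack it for JSON transport."""
    # Flatten to a binary byte string: 1 where the pixel value exceeds the threshold.
    binary = bytes(1 if px > threshold else 0 for row in mask_rows for px in row)
    # base64-encode, then decode to a utf-8 string so it fits in a JSON payload.
    encoded = base64.b64encode(binary).decode("utf-8")
    return json.dumps({"height": len(mask_rows),
                       "width": len(mask_rows[0]),
                       "mask": encoded})


def unpack_segmentation_mask(payload):
    """Inverse operation on the client side: JSON string back to a binary image."""
    obj = json.loads(payload)
    binary = base64.b64decode(obj["mask"])
    w = obj["width"]
    return [list(binary[i * w:(i + 1) * w]) for i in range(obj["height"])]
```

Base64 is used because JSON cannot carry raw bytes; decoding the base64 output to utf-8 yields a plain ASCII-safe string suitable for the applet's HTTP transport.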
Technical effects
1. The method first acquires retinal images in which the fundus image lesion areas have been marked, then inputs these images into a Mask RCNN-based retinal fundus image multi-lesion classification network model and trains until the model converges. A fundus picture to be diagnosed is then acquired and input into the trained model to obtain a diagnosis result. The embodiment of the invention thereby diagnoses diabetic retinopathy on the basis of image recognition technology, improving both the accuracy and the speed of diagnosis.
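During this training, claim 3 replaces the original classification cross-entropy loss with focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). A minimal numeric sketch of that loss follows; the default values of α_t and γ are common choices from the focal loss literature, not values stated in the patent.

```python
import math


def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p_t is the predicted probability of the correct class. Well-classified
    samples (p_t near 1) are down-weighted by the (1 - p_t)**gamma factor,
    which focuses training on hard, misclassified examples."""
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With α_t = 1 and γ = 0 the expression reduces to the ordinary cross entropy -log(p_t), which is why focal loss can drop in as a replacement for it.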
2. The skeleton network of the Mask RCNN-based retina fundus image multi-lesion classification network model adopts Resnet-101, and a Non-local module is added in front of the last module of the Res4 part of the skeleton network. This expands the receptive field of the network model and connects distant pixels, thereby enriching the semantic information.
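The Non-local operation defined in claim 2, y_i = (1/C(x)) Σ_∀j f(x_i, x_j) g(x_j), can be illustrated numerically. The sketch below is a deliberately minimal scalar version using the Gaussian pairwise function f(x_i, x_j) = exp(x_i · x_j) and a linear g; a real Non-local module operates on convolutional feature maps with learned embeddings, so the function name and the choice of f and g here are illustrative assumptions.

```python
import math


def nonlocal_response(x, g_weight=1.0):
    """Minimal scalar illustration of the Non-local operation
        y_i = (1 / C(x)) * sum_j f(x_i, x_j) * g(x_j)
    with the Gaussian pairwise function f(x_i, x_j) = exp(x_i * x_j)
    and a linear unary function g(x_j) = g_weight * x_j.

    x is a list of scalar features standing in for feature-map positions;
    every output position attends to every input position, which is what
    lets the module connect distant pixels."""
    y = []
    for xi in x:
        weights = [math.exp(xi * xj) for xj in x]  # f(x_i, x_j) for every j
        c = sum(weights)                           # normalization factor C(x)
        y.append(sum(w * g_weight * xj for w, xj in zip(weights, x)) / c)
    return y
```

Because each y_i sums over all positions j, the receptive field of a single Non-local layer is the whole input, in contrast to a convolution's local window.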
The embodiment of the invention designs a multi-lesion classification network model based on the instance segmentation network Mask RCNN and deploys it on a server, so that a doctor can upload a patient's fundus pictures through the mobile phone applet client and the server automatically performs detection and returns the results. The model can therefore be used effectively in real-world practice: it reduces the doctor's workload, lowers the misdiagnosis rate, and shortens the diagnosis period, all of which help timely diagnosis. At the same time, a computer-aided diagnosis network system is established, which collects many cases for doctors to learn from and continuously improves the diagnostic capability of both the doctors and the computer, thereby increasing detection accuracy, reducing blindness caused by diabetes, and, more importantly, helping to realize an intelligent medical network for the whole of society. The combination of a mobile handheld fundus camera with a WeChat applet system gives the system portability and mobility that are of great significance for ophthalmic diagnosis in remote regions and primary medical institutions.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the invention without departing from the invention
With clear spirit and scope. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for identifying diabetic retinopathy based on cloud computing is characterized by comprising the following steps:
acquiring a retina image marked with a fundus image lesion area;
inputting the retina image marked with the fundus image lesion area into a Mask RCNN-based retina fundus image multi-lesion classification network model for training until the model converges;
acquiring a fundus picture to be diagnosed;
and inputting the fundus picture to be diagnosed into a trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain a diagnosis result.
2. The identification method according to claim 1, wherein the skeleton network of the Mask RCNN-based retinal fundus image multiple lesion classification network model adopts Resnet-101, and a Non-local module is added before the last module of the Res4 part in the skeleton network;
the expression of the Non-local module is shown as follows:

y_i = \frac{1}{C(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j)

where C(x) is a normalization function; i is the index of a response output position and j enumerates all possible positions; x represents the input information and y represents output information of the same size as x; f is a function that computes the correlation between each pixel and all related pixels; and the unary function g computes a feature representation of the input information at position j.
3. The identification method of claim 2, further comprising:
in the Mask RCNN-based retina fundus image multi-lesion classification network model, replacing original classification cross entropy loss with focal loss;
FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)

where FL denotes the focal loss function, p_t represents the probability that the prediction sample belongs to the correct class, \alpha_t is a weighting factor, and \gamma is a focusing parameter.
4. The identification method according to claim 1, wherein the inputting the fundus picture to be diagnosed into the trained Mask RCNN-based retina fundus image multi-lesion classification network model to obtain the diagnosis result comprises:
and inputting the fundus picture to be diagnosed into the trained Mask RCNN-based retina fundus image multi-lesion classification network model to obtain processing time, maximum confidence of lesions, image id and a lesion segmentation result picture.
5. The identification method according to claim 4, further comprising, after obtaining the lesion segmentation result map:
converting the lesion segmentation result map into a binary image;
and encoding the binary image using a base64 library, decoding it into utf-8 format, and transmitting it as a character string in json format.
6. A device for identifying diabetic retinopathy based on cloud computing, characterized by comprising:
the retina image acquisition module is used for acquiring a retina image marked with a fundus image lesion area;
the network model training module is used for inputting the retina image marked with the fundus image pathological change area into a Mask RCNN-based retina fundus image multi-pathological change classification network model for training until the model converges;
the fundus picture acquisition module is used for acquiring a fundus picture to be diagnosed;
and the diagnosis module is used for inputting the fundus picture to be diagnosed into a trained retina fundus image multi-lesion classification network model based on Mask RCNN to obtain a diagnosis result.
7. The identification apparatus according to claim 6, wherein the skeleton network based on Mask RCNN retinal fundus image multiple lesion classification network model adopts Resnet-101, and a Non-local module is added before the last module of Res4 part in the skeleton network;
the expression of the Non-local module is shown as follows:
y_i = \frac{1}{C(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j)

where C(x) is a normalization function; i is the index of a response output position and j enumerates all possible positions; x represents the input information and y represents output information of the same size as x; f is a function that computes the correlation between each pixel and all related pixels; and the unary function g computes a feature representation of the input information at position j.
8. The identification device of claim 7, further comprising:
in the Mask RCNN-based retina fundus image multi-lesion classification network model, replacing original classification cross entropy loss with focal loss;
FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)

where FL denotes the focal loss function, p_t represents the probability that the prediction sample belongs to the correct class, \alpha_t is a weighting factor, and \gamma is a focusing parameter.
9. The identification apparatus according to claim 6, wherein the diagnosis module is specifically configured to input the fundus image to be diagnosed into the trained Mask RCNN-based retina fundus image multi-lesion classification network model, so as to obtain a processing time, a maximum confidence of a lesion, an image id, and a lesion segmentation result map.
10. The identification device of claim 9, further comprising:
a result image conversion module for converting the lesion segmentation result image into a binary image;
and the result graph transmission module is used for encoding the binary image using a base64 library, decoding it into utf-8 format, and transmitting it as a character string in json format.
CN202111494551.4A 2021-12-08 2021-12-08 Method and device for identifying diabetic retinopathy based on cloud computing Pending CN113902743A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111494551.4A CN113902743A (en) 2021-12-08 2021-12-08 Method and device for identifying diabetic retinopathy based on cloud computing

Publications (1)

Publication Number Publication Date
CN113902743A true CN113902743A (en) 2022-01-07

Family

ID=79025850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111494551.4A Pending CN113902743A (en) 2021-12-08 2021-12-08 Method and device for identifying diabetic retinopathy based on cloud computing

Country Status (1)

Country Link
CN (1) CN113902743A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116152229A (en) * 2023-04-14 2023-05-23 吉林大学 Method for constructing diabetic retinopathy diagnosis model and diagnosis model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472171A (en) * 2019-07-30 2019-11-19 成都摹客科技有限公司 Method, plug-in system, equipment and storage medium based on plug-in unit uploading pictures
CN110490860A (en) * 2019-08-21 2019-11-22 北京大恒普信医疗技术有限公司 Diabetic retinopathy recognition methods, device and electronic equipment
CN111046717A (en) * 2019-10-11 2020-04-21 平安科技(深圳)有限公司 Fundus image macular center positioning method and device, electronic equipment and storage medium
CN112001401A (en) * 2020-07-29 2020-11-27 苏州浪潮智能科技有限公司 Training model and training method of example segmentation network, and example segmentation network
CN112651938A (en) * 2020-12-24 2021-04-13 平安科技(深圳)有限公司 Method, device and equipment for training video disc image classification model and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
AI Tech Camp (AI科技大本营): "Exclusive | He Kaiming's team proposes Focal Loss, object detection accuracy reaches 39.1 AP, breaking the existing record", HTTPS://ZHUANLAN.ZHIHU.COM/P/28442066 *
MAOSONGRAN: "Handling sample imbalance in object detection — OHEM, Focal Loss, GHM, PISA", HTTPS://WWW.JIANSHU.COM/P/F305B573DF8F *
TSUNG-YI LIN et al.: "Focal Loss for Dense Object Detection", 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) *
XIAOLONG WANG et al.: "Non-local Neural Networks", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION *
为了写博客,要取一个好的名字: "Focal loss explained in detail", HTTPS://BLOG.CSDN.NET/CXKYXX/ARTICLE/DETAILS/108455805 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220107