CN114931356A - Retina structure extraction method, system and application for OCTA image


Info

Publication number
CN114931356A
CN114931356A
Authority
CN
China
Prior art keywords
features
image
feature
voting
task
Prior art date
Legal status
Pending
Application number
CN202210721487.7A
Other languages
Chinese (zh)
Inventor
陈方胜
郝晋奎
赵一天
Current Assignee
Ningbo Institute of Material Technology and Engineering of CAS
Cixi Institute of Biomedical Engineering CIBE of CAS
Original Assignee
Ningbo Institute of Material Technology and Engineering of CAS
Cixi Institute of Biomedical Engineering CIBE of CAS
Priority date
Filing date
Publication date
Application filed by Ningbo Institute of Material Technology and Engineering of CAS and Cixi Institute of Biomedical Engineering CIBE of CAS
Priority to CN202210721487.7A
Publication of CN114931356A
Legal status: Pending


Classifications

    • A61B 3/102 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions, for optical coherence tomography [OCT]
    • A61B 3/103 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions, for determining refraction, e.g. refractometers, skiascopes
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T 7/0012 Biomedical image inspection
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V 10/766 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06V 10/817 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, by voting
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T 2207/10101 Optical tomography; Optical coherence tomography [OCT]
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30041 Eye; Retina; Ophthalmic
    • G06V 2201/03 Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Ophthalmology & Optometry (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Eye Examination Apparatus (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a retina structure extraction method, system and application for OCTA images. The OCTA-image-oriented retina structure extraction method comprises the following steps: acquiring an OCTA image; extracting, from a plurality of sub-layers of the OCTA image, a multi-scale fusion feature corresponding to each sub-layer; extracting voting gate features from the multi-scale fusion features using a plurality of different voting gate modules; detecting key points in the voting gate features to obtain key point features; and taking the key point features and, optionally, part of the voting gate features as the feature extraction result of the OCTA image. The retina structure extraction method provided by the invention comprehensively exploits the rich sub-layer information of the OCTA image and fully integrates the features of different sub-layers. As a brand-new end-to-end multi-task learning framework, it jointly learns the characteristic structures in the OCTA image and provides an effective and accurate means for the quantitative analysis of retinal vascular indices.

Description

Retina structure extraction method, system and application for OCTA image
Technical Field
The invention relates to the technical field of image recognition, in particular to the field of ocular optical coherence tomography angiography image recognition, and specifically to a retina structure extraction method and system for OCTA images, and their application.
Background
Optical Coherence Tomography Angiography (OCTA) is a fast, non-invasive imaging technique built on Optical Coherence Tomography (OCT) platforms that can generate images containing functional information on retinal vessels and microvasculature. Quantitative information on retinal indices obtained from OCTA images plays a crucial role in quantitative research and clinical decision-making for ocular and neurodegenerative diseases.
For example, the eyes of Alzheimer's Disease (AD) patients show a significant decrease in retinal vascular density compared to healthy controls. In some retinal diseases, the size of the Foveal Avascular Zone (FAZ) and the density of vessel connections also differ significantly from those of healthy controls.
For these reasons, automatically and accurately extracting feature images of retinal structures from OCTA is of great significance for the early diagnosis of retinal-circulation-related diseases and the assessment of disease progression.
However, existing work on extracting retinal feature images from OCTA images focuses only on single-task learning, which means that if several structural indices need to be quantified, several different networks must be trained separately. One possible solution in the prior art is to train a model using multi-task learning (MTL), which can perform multiple tasks simultaneously rather than building a set of independent networks. However, current MTL work is designed specifically for natural images, which limits its applicability to OCTA images due to several challenges. Most importantly, the methods described above use a single two-dimensional image as input, making it difficult to fully exploit the rich sub-layer information of OCTA images.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to provide a retina structure extraction method, system and application for OCTA images.
To achieve this aim, the technical solution adopted by the invention comprises the following aspects:
in a first aspect, the present invention provides a retinal structure extraction method for an OCTA image, including:
1) acquiring an OCTA image, the OCTA image comprising a plurality of sub-layers;
2) extracting a multi-scale fusion feature corresponding to each sub-layer from a plurality of sub-layers of the OCTA image;
3) extracting voting gate features from the plurality of multi-scale fusion features using a plurality of different voting gate modules;
4) detecting key points of the voting gate features to obtain key point features;
5) taking the key point features and, optionally, part of the voting gate features as the feature extraction result of the OCTA image. A high-level sketch of these five steps is given below.
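As an illustration only, the five steps above can be organized as a single pipeline. The following Python sketch is a hypothetical arrangement: every callable passed in (feature extractor, voting gates, key point detector) is a placeholder for the modules detailed later in the description, not an API defined by the patent.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def extract_retinal_structures(
    octa_sublayers: Sequence,                           # step 1: sub-layer images
    extract_features: Callable[[Sequence], List],       # step 2: multi-scale fusion
    voting_gates: Dict[str, Callable[[List], object]],  # step 3: one voting gate module per task
    detect_keypoints: Callable[[object], object],       # step 4: key point detection
) -> Tuple[object, Dict[str, object]]:
    fused = extract_features(octa_sublayers)
    gated = {task: gate(fused) for task, gate in voting_gates.items()}
    keypoints = detect_keypoints(gated["RVJ"])
    # step 5: the key point features plus, optionally, part of the voting
    # gate features (here the RV and FAZ maps) form the extraction result.
    return keypoints, {task: gated[task] for task in ("RV", "FAZ")}
```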
In a second aspect, the present invention further provides an OCTA image-oriented retina structure extraction system, which includes a multitask feature extraction network based on a voting mechanism, where the multitask feature extraction network includes:
a feature extraction module, which comprises a plurality of feature extractors, wherein the plurality of feature extractors are respectively used for extracting multi-scale fusion features corresponding to each sub-layer from different input OCTA image sub-layers,
a plurality of independent task-specific voting gate modules for adaptively selecting the most important features for each task; and
a keypoint detection module comprising:
a first task unit based on heat map regression, for realizing the localization of key points,
and a second task unit based on region classification, for identifying the type of the key points.
In a third aspect, the present invention further provides a neural network training method, including:
acquiring a training image, and marking the training image to generate marking information;
performing feature extraction on the training image by using the retina structure extraction method to obtain a feature extraction result corresponding to the training image;
and training a neural network used by the retina structure extraction method according to the marking information and the feature extraction result.
In a fourth aspect, the present invention further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the processor executes the steps of the retinal structure extraction method described above when executing the computer program.
In a fifth aspect, the present invention also provides a readable storage medium storing a computer program which, when executed by a processor, implements the steps of the retinal structure extraction method described above.
Based on the technical scheme, compared with the prior art, the invention has the beneficial effects that at least:
the invention provides an OCTA retina structure extraction method, which utilizes rich sublayer information of an OCTA image to carry out comprehensive extraction, utilizes a voting mechanism to extract voting gate features from extracted multi-scale fusion features and identifies partial key point features according to the voting gate features, and finally fuses the key point features and the partial voting gate features as a final feature extraction result.
The foregoing description is only an overview of the technical solutions of the present invention, and in order to enable those skilled in the art to more clearly understand the technical solutions of the present invention and to implement them according to the content of the description, the following description is made with reference to the preferred embodiments of the present invention and the detailed drawings.
Drawings
Fig. 1 is a schematic flowchart of a retinal structure extraction method for an OCTA image according to an exemplary embodiment of the present invention;
fig. 2 is a schematic diagram of an extraction process of a retina structure extraction method for an OCTA image according to an exemplary embodiment of the present invention;
fig. 3 is a schematic flowchart of a step of detecting key points of the voting gate feature in a retina structure extraction method for an OCTA image according to an exemplary embodiment of the present invention;
fig. 4 shows sub-layer images used in, and feature extraction result images produced by, a retinal structure extraction method for OCTA images according to an exemplary embodiment of the present invention.
Detailed Description
In view of the deficiencies in the prior art, the inventors of the present invention have made extensive studies and extensive practices to provide technical solutions of the present invention. The technical solution, its implementation and principles, etc. will be further explained as follows.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein, and thus the scope of the present invention is not limited by the specific embodiments disclosed below.
Moreover, relational terms such as "first" and "second," and the like, may be used solely to distinguish one element or method step from another element or method step having the same name, without necessarily requiring or implying any actual such relationship or order between such elements or method steps.
Referring to fig. 1 and fig. 2, an OCTA image-oriented retina structure extraction method provided in an embodiment of the present invention includes the following steps:
1) an OCTA image is acquired, the OCTA image including a plurality of sub-layers.
2) Extracting a multi-scale fusion feature corresponding to each sub-layer from a plurality of sub-layers of the OCTA image.
3) Extracting voting gate features from the plurality of multi-scale fusion features using a plurality of different voting gate modules.
4) And detecting key points of the voting gate characteristics to obtain key point characteristics.
5) The key point features and, optionally, part of the voting gate features are taken as the feature extraction result of the OCTA image.
In some embodiments, the sub-layers in step 1) may include at least a superficial vascular complex layer, a deep vascular complex layer, and an inner vascular complex (whole inner retina) layer.
In some embodiments, the keypoints in step 4) may comprise at least a vessel bifurcation and a vessel intersection.
In some embodiments, the feature extraction result in step 5) may include at least a retinal blood vessel image, a foveal avascular zone image, and a retinal blood vessel bifurcation/intersection image.
The above embodiments of the present invention provide an end-to-end multi-task learning algorithm that jointly learns the Retinal Vessels (RV), Foveal Avascular Zone (FAZ) and retinal vessel bifurcations/intersections (RVJ) in an OCTA image using images of the Superficial Vascular Complex (SVC), Deep Vascular Complex (DVC) and Inner Vascular Complex (IVC), thereby providing another effective way for the quantitative analysis of retinal structure indices and the diagnosis of fundus diseases.
It should be noted that the OCTA retinal structure extraction method provided by the present invention only concerns processing images and providing feature results from OCTA images; it does not concern subsequent disease diagnosis or treatment methods, nor any method of invading the human body to obtain samples or data.
With continued reference to fig. 2, in some embodiments, step 2) may specifically include the steps of:
the plurality of sub-layers are input into a first convolution layer of a plurality of feature extractors, obtaining a plurality of first convolution features corresponding to each of the sub-layers.
And respectively inputting the first convolution features into the residual convolution layers of the feature extractors to obtain the multi-scale fusion features corresponding to each sub-layer.
In some embodiments, the weight parameters of a plurality of said feature extractors may be the same.
In some embodiments, the feature extractors may each be comprised of a ResNet-50 network, and the convolution size of the first convolution layer may be 3 x 3.
Based on the above embodiment, as an example, in some typical application cases the feature extraction module may consist of three ResNet-50 feature extractors, with the first 7 × 7 convolution layer replaced by a 3 × 3 convolution with the same padding to ensure that the output size of the voting gate module is consistent with the size of the input image. In the present invention, the three feature extractors share weights except for the first convolution layer, in order to limit the number of learnable parameters. Owing to the independence of the different inputs and first layers, the three feature extractors can still extract different features despite the weight-sharing strategy.
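As a minimal sketch only, the three-branch extractor described above might be organized as follows in PyTorch. The grayscale input assumption, channel widths and the omission of the ResNet max-pooling stage are illustrative simplifications, not details fixed by the patent.

```python
import torch
import torch.nn as nn
import torchvision

class SharedBackboneExtractor(nn.Module):
    """Three branches with independent 3x3 first convolutions (replacing the
    usual 7x7 ResNet stem) and ResNet-50 stages shared across branches."""

    def __init__(self, num_branches: int = 3):
        super().__init__()
        # One independent first conv per sub-layer input (e.g. SVC, DVC, IVC);
        # 3x3 with padding=1 preserves the spatial size of the input.
        self.first_convs = nn.ModuleList(
            nn.Conv2d(1, 64, kernel_size=3, padding=1, bias=False)
            for _ in range(num_branches)
        )
        # All remaining stages come from a single ResNet-50 and are shared,
        # limiting the number of learnable parameters.
        backbone = torchvision.models.resnet50(weights=None)
        self.shared = nn.Sequential(
            backbone.bn1, backbone.relu,
            backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4,
        )

    def forward(self, sublayers):
        # sublayers: list of three (N, 1, H, W) en face images.
        first_feats = [conv(x) for conv, x in zip(self.first_convs, sublayers)]
        deep_feats = [self.shared(f) for f in first_feats]
        # First-layer outputs feed the voting gate modules; the deep features
        # stand in for the multi-scale fusion features F_i.
        return first_feats, deep_feats
```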
In some embodiments, step 3) may specifically include the steps of:
and splicing the plurality of first convolution features output by the first convolution layer to obtain a splicing feature.
Inputting the splicing characteristics into a plurality of independent voting gate modules corresponding to specific tasks to obtain voting gate information corresponding to the specific tasks; wherein the specific task is selected from one of a retinal vessel imaging task, a foveal avascular area imaging task, and a retinal vessel bifurcation/junction imaging task.
And extracting voting gate features from the multi-scale fusion features by using the voting gate information.
In some embodiments, the voting gate module may select features as the voting gate information on a plurality of sub-layers and a plurality of spatial locations according to its corresponding specific task, wherein the spatial locations refer to locations of the selected features on the feature image.
In some embodiments, the plurality of spatial locations may include at least the central region of the macula, the locations where blood vessels intersect, and the edges of blood vessels.
In some embodiments, the extracting of the voting gate feature from the multi-scale fusion feature by using the voting gate information may specifically be implemented by the following steps:
and multiplying the voting gate information with each multi-scale fusion feature respectively to obtain multiplication features, and summing a plurality of multiplication features corresponding to all the multi-scale fusion features to obtain comprehensive feature mapping corresponding to the specific task.
And obtaining the voting gate characteristics by utilizing the comprehensive characteristic mapping.
In the technical solution provided by the embodiment of the present invention, the task branches preferably include 3 parts, namely task branches for RV, FAZ and RVJ. The task branches for RV and FAZ may adopt the same structure, each consisting of two 3 × 3 convolution blocks. Since the RVJ task is more complex, a special task branch is preferably designed for it to improve accuracy.
Based on the above embodiments, as an example, some typical application cases include three independent task-specific Voting Gate Modules (VGM), each composed of a plurality of 3 × 3 convolutional layers with Batch Normalization (BN) and ReLU activation functions. The last convolutional layer uses a sigmoid operator to map features into probabilistic form, and its 3 output channels serve as weights for selecting features. The VGM for each task ∈ {RV, FAZ, RVJ} takes as input the concatenation of the first-layer outputs of the three feature extractors, and the corresponding output $G_{task}$ is a learned voter that can select features at two levels. The first level is across the different layers of the OCTA, since the importance of the features obtained from the three en face images differs for each task. The second level is across the different spatial positions of the feature extractor, because each task attends to different position information: FAZ segmentation concentrates on the central region of the macula, the locations where vessels intersect are more critical for the bifurcation/intersection classification task, and vessel segmentation needs to pay more attention to vessel edge information. After obtaining the voting gate $G_{task}$ for each task, the multi-scale fusion features $F_{i}$ ($i \in \{1, 2, 3\}$) of the three feature extractors are multiplied by $G_{task}$ ($task \in \{RV, FAZ, RVJ\}$) channel by channel and summed to obtain the comprehensive feature map $Y_{task}$ of the corresponding task. This process can be expressed as:

$$Y_{task} = \sum_{i=1}^{3} G_{task}^{i} \odot F_{i}$$

where $G_{task}^{i}$ denotes the $i$-th channel of the voting gate $G_{task}$ and $\odot$ denotes element-wise multiplication. The task-specific feature maps $\{Y_{task}\}$ are then sent to the corresponding task branches to obtain the final results. In this way, the most important features can be selected adaptively to achieve the best effect.
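A hedged PyTorch sketch of one task-specific voting gate module and the vote-and-sum fusion above follows. The hidden channel width and the number of convolution layers are assumptions; the patent only fixes the 3 × 3 convolutions with BN/ReLU, the final sigmoid, and the 3-channel gate (one channel per sub-layer branch).

```python
import torch
import torch.nn as nn

class VotingGateModule(nn.Module):
    """Learns a voting gate G_task and fuses branch features F_i into Y_task."""

    def __init__(self, in_channels: int, hidden: int = 64, num_branches: int = 3):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, num_branches, kernel_size=3, padding=1),
            nn.Sigmoid(),  # maps features to probabilistic per-location weights
        )

    def forward(self, concat_first_feats, fusion_feats):
        # concat_first_feats: (N, in_channels, H, W) concatenation of the three
        # first-layer outputs; fusion_feats: list of three (N, C, H, W) maps F_i.
        g = self.gate(concat_first_feats)               # G_task: (N, 3, H, W)
        y = sum(g[:, i:i + 1] * f                       # G_task^i * F_i per branch
                for i, f in enumerate(fusion_feats))    # summed over branches
        return y                                        # Y_task
```

One such module would be instantiated per task (RV, FAZ, RVJ), each learning its own gate from the shared concatenated first-layer features.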
The invention provides, for the first time, a voting-based feature integration method that automatically selects features according to task attributes, namely features from different layers and features at different spatial positions in an image, effectively improving the effect of feature extraction.
With respect to improvements in other aspects of the invention, the examples of the invention provide the following embodiments:
in some embodiments, step 4) may specifically include the following steps:
the voting gate features are divided into a plurality of grids.
Predicted location information for key points in a grid is obtained from the grid using heat map regression.
Obtaining the predicted category information of the key points from the grids by using a region classification model; wherein the prediction category information is used to indicate that a keypoint in the mesh belongs to a vessel intersection or vessel bifurcation or no keypoint within the mesh.
And fusing the predicted position information and the predicted category information to obtain the key point characteristics.
In some embodiments, the region classification model may predict category confidence information for the key points in a mesh and determine, according to whether the category confidence information is greater than a preset threshold, whether the key point in the mesh belongs to a vessel intersection or a vessel bifurcation, or whether the mesh contains no key point. The category confidence information indicates the confidence of each key point category within the mesh. For example, if a mesh has an intersection confidence of 85% and a bifurcation confidence of 5%, with a preset threshold of 80%, the key point in that mesh is determined to be an intersection; if a mesh has an intersection confidence of 50% and a bifurcation confidence of 20%, with a preset threshold of 60%, the mesh is determined to contain no key point. As a preferred scheme, when both confidences are greater than the preset threshold, the key point category with the highest confidence may be selected as the final key point category. A minimal sketch of this rule follows.
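The threshold rule described above reduces to a few lines; the sketch below mirrors the two worked examples, with all names being illustrative.

```python
def classify_cell(conf_intersection: float, conf_bifurcation: float,
                  threshold: float) -> str:
    """Label one grid cell from its per-class confidences."""
    scores = {"intersection": conf_intersection, "bifurcation": conf_bifurcation}
    best = max(scores, key=scores.get)  # highest-confidence category wins
    return best if scores[best] > threshold else "no_keypoint"

# Mirroring the examples above:
# classify_cell(0.85, 0.05, 0.80) -> "intersection"
# classify_cell(0.50, 0.20, 0.60) -> "no_keypoint"
```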
The challenge of bifurcation/intersection detection and classification is twofold: 1) unlike common key point detection, the number of vessel landmarks in the image and their spatial distribution are unknown; 2) the nodes are all small targets containing only a few precise pixels, so bounding-box-based object detection methods struggle to achieve satisfactory performance on this task. In view of these problems, the present invention proposes a two-branch task head combining heat map regression and region classification to detect bifurcation/intersection points. Specifically, this complex task can be divided into two simpler tasks: heat map regression achieves the accurate localization of all key points (i.e., bifurcations/intersections), while the region classification branch is responsible for identifying the type of each key point. The flow is shown in fig. 3.
Based on the above embodiments, as an example, in some typical application cases, the feature map for key point detection, i.e. the task-specific map $Y_{RVJ}$, is first fed into a convolution block consisting of two 3 × 3 convolutional layers with BN and an activation function. The last convolutional layer with an activation function outputs a 1-channel heat map giving the positions of all key points. The other branch also takes $Y_{RVJ}$ as input and outputs S × S meshes representing different regions of the image. For each mesh region, the region classification branch predicts whether it contains a bifurcation point or an intersection point, or no key point. Since no bounding-box coordinates need to be predicted, the model can focus on predicting region types, with better performance than bounding-box-based approaches. Meanwhile, the confidence score predicted for each region reflects the model's confidence that the region contains a key point, and can be used for threshold selection in the final processing. The present invention sets each region unit to 8 × 8; for a 304 × 304 input image, the final prediction of this branch is a 38 × 38 × 4 tensor. Finally, the results of the two branches are combined to obtain the final prediction result.
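A hedged sketch of the two-branch key point head described above: a 1-channel heat map locates key points at full resolution, while the region branch maps each 8 × 8 cell of a 304 × 304 input to a 38 × 38 × 4 prediction. The meaning assigned to the 4 region channels and the pooling-based downsampling are assumptions consistent with, but not fixed by, the text.

```python
import torch
import torch.nn as nn

class KeypointHead(nn.Module):
    def __init__(self, in_channels: int, hidden: int = 64,
                 cell: int = 8, region_out: int = 4):
        super().__init__()
        # Heat map branch: two 3x3 convs with BN/activation, 1-channel output.
        self.heatmap = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, padding=1),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 1, 3, padding=1),
            nn.Sigmoid(),                       # per-pixel keypoint likelihood
        )
        # Region branch: one prediction per cell x cell grid region.
        self.region = nn.Sequential(
            nn.AvgPool2d(kernel_size=cell),     # 304x304 -> 38x38 grid cells
            nn.Conv2d(in_channels, hidden, 3, padding=1),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, region_out, 1),   # (N, 4, 38, 38) region tensor
        )

    def forward(self, y_rvj):
        return self.heatmap(y_rvj), self.region(y_rvj)

# Shape check with a hypothetical 64-channel task map:
# heat, regions = KeypointHead(64)(torch.randn(1, 64, 304, 304))
# heat: (1, 1, 304, 304); regions: (1, 4, 38, 38)
```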
The invention proposes, for the first time, a module combining heat map regression and region classification for bifurcation/intersection detection and classification; this method effectively resolves the difficulty that the complexity of the microvasculature in OCTA images poses for accurate localization and classification.
In addition, the retinal structure extraction method provided by the embodiments of the invention was evaluated on three OCTA data sets; the publicly available ROSE data set and an independent private data set were used for verification in the experimental stage. Compared with other multi-task methods, the algorithm provided by the embodiments significantly reduces model complexity: for example, among existing multi-task models, UberNet is 41.37M, MTAN is 44.79M and MTI-Net is 94.29M in size, while the model provided by the embodiments of the invention is 34.73M. Meanwhile, for the accuracy of the RV and FAZ segmentation results, Dice and Accuracy are improved by 5% and 2% respectively over the most advanced methods, and the F1-Score for RVJ classification is improved by 4%. The experimental results show that the method provided by the embodiments of the invention outperforms existing single-task and multi-task learning methods. In a specific application case, the IVC, SVC and DVC sub-layer images used by the retinal structure extraction method and the retinal structure feature images extracted by the method (RV, RVJ, FAZ) are shown in fig. 4.
The embodiment of the invention also provides an OCTA image-oriented retina structure extraction system, which comprises a multi-task feature extraction network based on a voting mechanism, wherein the multi-task feature extraction network comprises:
the feature extraction module comprises a plurality of feature extractors, and the plurality of feature extractors are respectively used for extracting multi-scale fusion features corresponding to each sublayer from OCTA image sublayers of different inputs.
A plurality of independent task-specific voting gate modules for adaptively selecting the most important voting gate features for each task.
And a key point detection module, comprising a first task unit based on heat map regression and a second task unit based on region classification, wherein the first task unit is used for realizing the localization of key points and the second task unit is used for identifying the types of key points.
The embodiment of the invention also provides a neural network training method, which comprises the following steps:
acquiring a training image, and marking the training image to generate marking information;
performing feature extraction on the training image by using the retina structure extraction method in any one of the above embodiments to obtain a feature extraction result corresponding to the training image;
and training a neural network used by the retina structure extraction method according to the marking information and the feature extraction result.
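As an assumption-laden sketch only: the patent does not specify loss functions, so the joint training step below pairs binary cross-entropy for the RV/FAZ segmentation tasks with mean-squared error for the heat map and cross-entropy for the region classes; the model interface and equal loss weighting are likewise illustrative.

```python
import torch
import torch.nn.functional as F

def training_step(model, batch, optimizer):
    # batch: sub-layer inputs plus the marking information described above.
    sublayers, rv_gt, faz_gt, heat_gt, region_gt = batch
    rv, faz, heat, regions = model(sublayers)        # joint multi-task forward pass
    loss = (F.binary_cross_entropy_with_logits(rv, rv_gt)      # RV segmentation
            + F.binary_cross_entropy_with_logits(faz, faz_gt)  # FAZ segmentation
            + F.mse_loss(heat, heat_gt)                        # heat map regression
            + F.cross_entropy(regions, region_gt))             # region classification
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```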
An embodiment of the present invention further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the retinal structure extraction method in any of the above embodiments when executing the computer program.
An embodiment of the present invention further provides a readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the retinal structure extraction method in any one of the above embodiments are implemented.
An embodiment of the present invention further provides an extraction apparatus for a retina structure oriented to an OCTA image, including:
an image acquisition module to acquire an OCTA image comprising a plurality of sub-layers. Wherein the sub-layers may include at least a superficial vascular complex layer, a deep vascular complex layer, and a holoplex retinal layer.
A feature extraction module, configured to extract a multi-scale fusion feature corresponding to each sub-layer from a plurality of sub-layers of the OCTA image.
A plurality of different voting gate modules for extracting voting gate features from the plurality of multi-scale fusion features.
And the key point detection module is used for detecting key points of the voting gate characteristics to obtain key point characteristics. Wherein the keypoints may include at least a vessel bifurcation and a vessel intersection.
And an extraction result module for taking the key point features and, optionally, part of the voting gate features as the feature extraction result of the OCTA image, wherein the feature extraction result may include at least a retinal blood vessel image, a foveal avascular zone image and a retinal blood vessel bifurcation/intersection image.
In some embodiments, the feature extraction module specifically comprises:
a first convolution unit configured to input the plurality of sub-layers into a first convolution layer of a plurality of feature extractors to obtain a plurality of first convolution features corresponding to each of the sub-layers.
And the residual convolution unit is used for respectively inputting the first convolution characteristics into residual convolution layers of the characteristic extractors to obtain the multi-scale fusion characteristics corresponding to each sub-layer.
In some embodiments, the voting gate module specifically comprises:
and the splicing characteristic unit is used for splicing the first convolution layer output first convolution characteristics to obtain splicing characteristics.
A voting gate unit for inputting the splicing features into a plurality of independent voting gate modules corresponding to specific tasks to obtain voting gate information corresponding to the specific tasks, wherein each specific task is selected from one of a retinal vessel imaging task, a foveal avascular zone imaging task, and a retinal vessel bifurcation/junction imaging task. The voting gate module may select features on multiple sub-layers and at multiple spatial locations as the voting gate information according to its corresponding specific task; the spatial locations may include at least the central region of the macula, the locations where blood vessels intersect, and the edges of blood vessels.
And the feature extraction unit is used for extracting the voting gate features from the multi-scale fusion features by utilizing the voting gate information. The step may specifically include: multiplying the voting gate information by each multi-scale fusion feature respectively to obtain multiplication features, and summing a plurality of multiplication features corresponding to all the multi-scale fusion features to obtain comprehensive feature mapping corresponding to the specific task; and obtaining the voting gate characteristics by utilizing the comprehensive characteristic mapping.
In some embodiments, the key point detection module may specifically include:
and the grid dividing unit is used for dividing the voting gate characteristics into a plurality of grids.
A heat map regression unit to obtain predicted location information of key points in a grid from the grid using heat map regression.
A category prediction unit for obtaining predicted category information of the key points from the mesh using a region classification model, wherein the prediction category information indicates whether a key point in the mesh belongs to a vessel intersection or a vessel bifurcation, or whether the mesh contains no key point. The region classification model may predict category confidence information for the key points in the mesh and, according to whether the category confidence information is greater than a preset threshold, determine whether the key points in the mesh belong to a vessel intersection or a vessel bifurcation, or whether the mesh contains no key point.
And the information fusion unit is used for fusing the predicted position information and the predicted type information to obtain the key point characteristics.
It should be understood that the above-mentioned embodiments are merely illustrative of the technical concepts and features of the present invention, which are intended to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and therefore, the protection scope of the present invention is not limited thereby. All equivalent changes and modifications made according to the spirit of the present invention should be covered in the protection scope of the present invention.

Claims (10)

1. A retina structure extraction method for OCTA images, characterized by comprising the following steps:
1) acquiring an OCTA image, the OCTA image comprising a plurality of sub-layers;
2) extracting a multi-scale fusion feature corresponding to each sub-layer from a plurality of sub-layers of the OCTA image;
3) extracting voting gate features from the plurality of multi-scale fusion features using a plurality of different voting gate modules;
4) detecting key points of the voting gate features to obtain key point features;
5) taking the key point features and, optionally, part of the voting gate features as the feature extraction result of the OCTA image.
2. The retinal structure extraction method according to claim 1, wherein in step 1), the sub-layers include at least a superficial vascular complex layer, a deep vascular complex layer, and an inner vascular complex layer;
and/or, in step 4), the key points comprise at least a vessel bifurcation point and a vessel intersection point;
and/or, in the step 5), the feature extraction result at least comprises a retinal blood vessel image, a foveal avascular zone image and a retinal blood vessel bifurcation/intersection point image.
3. The retinal structure extraction method according to claim 1 or 2, wherein the step 2) specifically includes:
inputting the plurality of sub-layers into a first convolution layer of a plurality of feature extractors, obtaining a plurality of first convolution features corresponding to each of the sub-layers;
respectively inputting the first convolution features into residual convolution layers of the feature extractors to obtain multi-scale fusion features corresponding to the sub-layers;
preferably, the weight parameters of the plurality of feature extractors are the same.
4. The retinal structure extraction method of claim 3, wherein the feature extractors are each comprised of a ResNet-50 network.
5. The retinal structure extraction method according to claim 3, wherein the step 3) specifically includes:
splicing a plurality of first convolution features output by the first convolution layer to obtain splicing features;
inputting the splicing characteristics into a plurality of independent voting gate modules corresponding to specific tasks to obtain voting gate information corresponding to the specific tasks; wherein the specific task is selected from one of a retinal vessel image task, a foveal avascular region image task and a retinal vessel bifurcation/junction image task;
extracting voting gate features from the multi-scale fusion features by using the voting gate information;
preferably, the voting gate module selects features on a plurality of sub-layers and at a plurality of spatial positions as the voting gate information according to its corresponding specific task;
preferably, the plurality of spatial positions includes at least the central region of the macula, the locations where blood vessels intersect, and the edges of blood vessels.
6. The method as claimed in claim 5, wherein the extracting voting gate features from the multi-scale fusion features by using the voting gate information specifically comprises:
multiplying the voting gate information by each multi-scale fusion feature respectively to obtain multiplication features, and summing a plurality of multiplication features corresponding to all the multi-scale fusion features to obtain comprehensive feature mapping corresponding to the specific task;
obtaining voting gate features by using the comprehensive feature mapping;
preferably, the comprehensive feature mapping is obtained using the following formula:

$$Y_{task} = \sum_{i=1}^{3} G_{task}^{i} \odot F_{i}$$

wherein $Y_{task}$ represents the comprehensive feature map, $G_{task}^{i}$ represents the voting gate information corresponding to the $i$-th channel, $F_{i}$ represents the multi-scale fusion feature, and $\odot$ denotes element-wise multiplication.
7. The retinal structure extraction method according to claim 1 or 2, characterized in that step 4) specifically comprises:
dividing the voting gate features into a plurality of grids;
obtaining predicted location information for keypoints in a grid from the grid using heat map regression;
obtaining the predicted category information of the key points from the grids by using a region classification model; wherein the prediction category information is used for indicating that the key points in the mesh belong to a vessel intersection point or a vessel bifurcation point or no key point in the mesh;
fusing the predicted position information and the predicted category information to obtain the key point characteristics;
preferably, the region classification model predicts category confidence information of the key points in the mesh, and determines whether the key points in the mesh belong to a blood vessel intersection point or a blood vessel bifurcation point or no key point in the mesh according to whether the category confidence information is greater than a preset threshold.
8. An OCTA image-oriented retina structure extraction system is characterized by comprising a multitask feature extraction network based on a voting mechanism, wherein the multitask feature extraction network comprises:
a feature extraction module, including a plurality of feature extractors, respectively configured to extract multi-scale fusion features corresponding to each of the sublayers from different input OCTA image sublayers,
a plurality of independent task-specific voting gate modules for adaptively selecting the most important voting gate features for each task; and
a keypoint detection module comprising:
a first task unit based on heat map regression, for realizing the localization of key points,
and a second task unit based on region classification, for identifying the type of the key points.
9. A neural network training method, comprising:
acquiring a training image, and marking the training image to generate marking information;
performing feature extraction on the training image by using the retinal structure extraction method of any one of claims 1 to 7 to obtain a feature extraction result corresponding to the training image;
and training a neural network used by the retina structure extraction method according to the marking information and the feature extraction result.
10. An electronic device, characterized in that the electronic device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the steps of the retinal structure extraction method of any one of claims 1-7.
CN202210721487.7A 2022-06-23 2022-06-23 Retina structure extraction method, system and application for OCTA image Pending CN114931356A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210721487.7A CN114931356A (en) 2022-06-23 2022-06-23 Retina structure extraction method, system and application for OCTA image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210721487.7A CN114931356A (en) 2022-06-23 2022-06-23 Retina structure extraction method, system and application for OCTA image

Publications (1)

Publication Number Publication Date
CN114931356A (en) 2022-08-23

Family

ID=82868316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210721487.7A Pending CN114931356A (en) 2022-06-23 2022-06-23 Retina structure extraction method, system and application for OCTA image

Country Status (1)

Country Link
CN (1) CN114931356A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115760807A (en) * 2022-11-24 2023-03-07 湖南至真明扬技术服务有限公司 Retinal fundus image registration method and system
CN115760807B (en) * 2022-11-24 2024-01-19 北京至真健康科技有限公司 Retina fundus image registration method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination