Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs.

Importance: Deep learning is a family of computational methods that allow an algorithm to program itself by learning from a large set of examples that demonstrate the desired behavior, removing the need to specify rules explicitly. Application of these methods to medical imaging requires further assessment and validation.

Objective: To apply deep learning to create an algorithm for automated detection of diabetic retinopathy and diabetic macular edema in retinal fundus photographs.

Design and Setting: A specific type of neural network optimized for image classification, called a deep convolutional neural network, was trained using a retrospective development data set of 128 175 retinal images, which were graded 3 to 7 times for diabetic retinopathy, diabetic macular edema, and image gradability by a panel of 54 US licensed ophthalmologists and ophthalmology senior residents between May and December 2015. The resultant algorithm was validated in January and February 2016 using 2 separate data sets, both graded by at least 7 US board-certified ophthalmologists with high intragrader consistency.

Exposure: Deep learning-trained algorithm.

Main Outcomes and Measures: The sensitivity and specificity of the algorithm for detecting referable diabetic retinopathy (RDR), defined as moderate and worse diabetic retinopathy, referable diabetic macular edema, or both, were generated based on the reference standard of the majority decision of the ophthalmologist panel. The algorithm was evaluated at 2 operating points selected from the development set, one selected for high specificity and another for high sensitivity.

Results: The EyePACS-1 data set consisted of 9963 images from 4997 patients (mean age, 54.4 years; 62.2% women; prevalence of RDR, 683/8878 fully gradable images [7.8%]); the Messidor-2 data set had 1748 images from 874 patients (mean age, 57.6 years; 42.6% women; prevalence of RDR, 254/1745 fully gradable images [14.6%]). For detecting RDR, the algorithm had an area under the receiver operating characteristic curve of 0.991 (95% CI, 0.988-0.993) for EyePACS-1 and 0.990 (95% CI, 0.986-0.995) for Messidor-2. Using the first operating cut point with high specificity, for EyePACS-1, the sensitivity was 90.3% (95% CI, 87.5%-92.7%) and the specificity was 98.1% (95% CI, 97.8%-98.5%). For Messidor-2, the sensitivity was 87.0% (95% CI, 81.1%-91.0%) and the specificity was 98.5% (95% CI, 97.7%-99.1%). Using a second operating point with high sensitivity in the development set, for EyePACS-1 the sensitivity was 97.5% and the specificity was 93.4%, and for Messidor-2 the sensitivity was 96.1% and the specificity was 93.9%.

Conclusions and Relevance: In this evaluation of retinal fundus photographs from adults with diabetes, an algorithm based on deep machine learning had high sensitivity and specificity for detecting referable diabetic retinopathy. Further research is necessary to determine the feasibility of applying this algorithm in the clinical setting and to determine whether use of the algorithm could lead to improved care and outcomes compared with current ophthalmologic assessment.
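
The evaluation mechanics named in the abstract (a majority-decision reference standard, sensitivity and specificity, and 2 operating points chosen on the development set) can be illustrated with a small Python sketch. This is not the authors' code. The function names, the tie-handling rule, the target values of 1.0, and the toy scores and vote counts are all hypothetical and serve only to make the steps concrete.

```python
from typing import List, Optional, Sequence, Tuple


def majority_vote(grades: Sequence[bool]) -> bool:
    """Reference standard for one image: referable if most graders say so.

    Assumption (not from the abstract): a tie counts as non-referable.
    """
    return 2 * sum(grades) > len(grades)


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Call an image referable when its score is >= threshold.

    Assumes the labels contain at least one positive and one negative.
    """
    tp = fn = tn = fp = 0
    for score, referable in zip(scores, labels):
        predicted = score >= threshold
        if referable and predicted:
            tp += 1
        elif referable:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_specificity(
    scores: Sequence[float], labels: Sequence[bool], target: float
) -> Optional[float]:
    """Lowest observed score whose specificity meets the target, if any."""
    for t in sorted(set(scores)):
        if sensitivity_specificity(scores, labels, t)[1] >= target:
            return t
    return None


def threshold_for_sensitivity(
    scores: Sequence[float], labels: Sequence[bool], target: float
) -> Optional[float]:
    """Highest observed score whose sensitivity meets the target, if any."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity_specificity(scores, labels, t)[0] >= target:
            return t
    return None


# Hypothetical development set: algorithm scores plus, for each image,
# how many of 7 graders called it referable.
dev_scores: List[float] = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
dev_votes: List[int] = [0, 1, 2, 3, 5, 3, 6, 7, 4, 7]
dev_labels = [majority_vote([True] * v + [False] * (7 - v)) for v in dev_votes]

# Two operating points, both chosen on the development set only.
high_spec_t = threshold_for_specificity(dev_scores, dev_labels, target=1.0)
high_sens_t = threshold_for_sensitivity(dev_scores, dev_labels, target=1.0)

for name, t in [("high specificity", high_spec_t), ("high sensitivity", high_sens_t)]:
    sens, spec = sensitivity_specificity(dev_scores, dev_labels, t)
    print(f"{name}: threshold={t:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In a real evaluation the two thresholds would be frozen after this step and then applied unchanged to the held-out validation sets, so that the reported sensitivity and specificity are not tuned on the data used to measure them.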
