Diabetic retinopathy risk prediction for fundus examination using sparse learning: a cross-sectional study

Background: Blindness due to diabetic retinopathy (DR) is the major disability among diabetic patients. Although early management has been shown to prevent vision loss, diabetic patients have a low rate of routine ophthalmologic examination. Hence, we developed and validated sparse learning models to identify the risk of DR in diabetic patients.

Methods: Health records from the Korea National Health and Nutrition Examination Surveys (KNHANES) V-1 were used. The prediction models for DR were constructed using data from 327 diabetic patients and were validated internally on 163 patients in the KNHANES V-1. External validation was performed using 562 diabetic patients in the KNHANES V-2. The learning models, including ridge, elastic net, and LASSO, were compared to the traditional indicators of DR.

Results: According to the Bayesian information criterion, LASSO predicted DR most efficiently. In both the internal and external validation, LASSO was significantly superior to the traditional indicators in terms of the area under the receiver operating characteristic curve (AUC). LASSO showed an AUC of 0.81 and an accuracy of 73.6% in the internal validation, and an AUC of 0.82 and an accuracy of 75.2% in the external validation.

Conclusion: The sparse learning model using LASSO was effective in analyzing the underlying epidemiological patterns of DR. This is the first study to develop a machine learning model to predict DR risk using health records. LASSO can be an excellent choice when both discriminative power and variable selection are important in the analysis of high-dimensional electronic health records.
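The two validation metrics reported above, AUC and accuracy, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the study's own cutoff is not stated in the abstract.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly when a score at or above
    `threshold` is read as 'DR present'. The threshold is an assumption."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


# Hypothetical toy data: 1 = DR present, 0 = DR absent.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted risks from some fitted model.
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.6, 0.8]

print(f"AUC = {auc(labels, scores):.3f}")
print(f"accuracy = {accuracy(labels, scores):.3f}")
```

AUC depends only on how the model ranks patients, whereas accuracy also depends on the chosen cutoff, which is why the two figures are reported separately.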
