Alzheimer-type dementia prediction by sparse logistic regression using claim data

This study aimed to predict the risk of Alzheimer-type dementia in persons aged over 75 years who were not receiving long-term care services, using regularly collected claim data. A refined dataset of 48,123 persons was prepared from health insurance and long-term care insurance claim data from a large city in the metropolitan area of Japan. The features comprised the age and sex of subjects, 502 diseases based on ICD-10 diagnosis codes, and 107 prescription drugs based on therapeutic classes. The most important challenge in this work was feature selection from a large number of features. We adopted sparse logistic regression models with L0 regularization (SLR-L0) and L1 regularization (SLR-L1) as machine-learning classification models. These regularizations enable feature selection by estimating a sparse solution, in which only a few coefficients are non-zero, during model optimization. Predictions were made by integrating 100 predictors trained on bootstrap samples. The areas under the ROC curve (AUCs) were 0.663 for SLR-L0 and 0.660 for SLR-L1. Although these performances were similar, the average number of selected features was 13 out of a total of 611 for SLR-L0 and 253 for SLR-L1. The results indicate that SLR-L1 tended to include less useful features, whereas SLR-L0 narrowed the model down to influential features. SLR-L0 might therefore be more useful than SLR-L1 for practical use or for discussing risk factors with medical experts.
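
The workflow described above (sparse logistic regression, bootstrap-trained predictors whose outputs are integrated, AUC evaluation, and a count of selected features) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the data are synthetic, the L1 model is fitted with a plain proximal-gradient (soft-thresholding) loop, the bootstrap predictors are integrated by averaging their probabilities, and the function names, `lam`, `step`, and the use of 5 predictors instead of 100 are hypothetical choices. The L0-regularized variant is not shown.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrinks v and sets small values exactly to 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def predict_proba(model, x):
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_l1_logistic(X, y, lam=0.05, step=0.5, n_iter=100):
    """L1-regularized logistic regression via proximal gradient descent.
    lam, step and n_iter are illustrative values, not taken from the study."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba((w, b), xi) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= step * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def bagged_models(X, y, n_models, rng, **fit_kwargs):
    """Train one sparse model per bootstrap sample (drawn with replacement)."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_l1_logistic([X[i] for i in idx],
                                      [y[i] for i in idx], **fit_kwargs))
    return models

def bagged_proba(models, x):
    """Integrate the predictors by averaging their predicted probabilities."""
    return sum(predict_proba(m, x) for m in models) / len(models)

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic binary features: only the first two influence the outcome.
rng = random.Random(0)
n, d = 200, 12
X = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n)]
y = [1 if rng.random() < sigmoid(-1.5 + 2.0 * x[0] + 2.0 * x[1]) else 0
     for x in X]

models = bagged_models(X, y, n_models=5, rng=rng)
scores = [bagged_proba(models, x) for x in X]
train_auc = auc(scores, y)
avg_selected = sum(sum(1 for wj in w if wj != 0.0)
                   for w, _ in models) / len(models)
print(f"in-sample AUC: {train_auc:.3f}")
print(f"average selected features: {avg_selected:.1f} of {d}")
```

The AUC printed here is computed on the training data of a toy problem, so it says nothing about the reported results; the sketch only shows how the non-zero coefficients of each bootstrap model can be counted to compare sparsity between models.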
