Exploring the Performance of Stacking Classifier to Predict Depression Among the Elderly

Geriatric depression is a disorder prevalent among the elderly. It is characterized by symptoms such as diminished functioning, loss of interest in activities, insomnia or hypersomnia, fatigue or loss of energy, and observable psychomotor agitation or retardation. Many studies have aimed to predict geriatric depression from the perspective of healthcare informatics using data mining analytics. However, no study has focused on the performance of the stacking mechanism, one of the ensemble classifiers. This study therefore investigates the performance of the stacking approach in predicting geriatric depression using data from the Korea National Health and Nutrition Examination Survey (KNHANES) from 2010 to 2015. The KNHANES is a publicly available big dataset produced by a national surveillance system that has assessed the health and nutritional status of Koreans since 1998. It is a nationally representative cross-sectional survey that samples approximately 10,000 individuals each year. Using 9,089 records on geriatric depression in the Korean elderly (2010-2015), this study analyzed how the performance of the stacking mechanism changes when five classifiers (LR, DT, NN, SVM, and NBN) are combined as the base-level learner and the meta-level learner. Measured by accuracy and AUC, the stacking mechanism shows a more robust pattern when the base-level learner is relatively simple (e.g., LR or DT) and the meta-level learner is more complex (e.g., NBN, NN, or SVM). Specifically, before feature selection the stacking performance was highly competitive, with an accuracy of 0.8624 for LR(SVM), where the base-level learner is LR and the meta-level learner is SVM. After feature selection, the best accuracy was 0.8643 for DT(NN).
Similar results were obtained for AUC: LR(NN) achieved 0.8182 before feature selection, and LR(NBN) achieved 0.8147 after feature selection.
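As a minimal sketch of the stacking configuration described above, the following combines a simple base-level learner (logistic regression) with a more complex meta-level learner (an SVM), mirroring the LR(SVM) pairing, and evaluates it with accuracy and AUC. The KNHANES data are not reproduced here, so a synthetic binary-classification dataset (via `make_classification`) stands in; the scikit-learn `StackingClassifier` is an assumed implementation choice, not the authors' original toolchain.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the depression dataset (binary outcome),
# since the KNHANES data cannot be bundled here.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Base-level learner: LR; meta-level learner: SVM, i.e., the LR(SVM)
# configuration. Out-of-fold base predictions (cv=5) train the meta-learner.
stack = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000))],
    final_estimator=SVC(probability=True),
    cv=5,
)
stack.fit(X_tr, y_tr)

# Evaluate with the two metrics used in the study: accuracy and AUC.
acc = accuracy_score(y_te, stack.predict(X_te))
auc = roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1])
print(f"accuracy={acc:.3f}  AUC={auc:.3f}")
```

Swapping the entries in `estimators` and `final_estimator` (e.g., a decision tree base with a neural-network meta-learner) reproduces the other base/meta pairings compared in the study.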
