Machine learning on high dimensional shape data from subcortical brain surfaces: A comparison of feature selection and classification methods

Abstract: High-dimensional shape descriptors (HDSD) are useful for modeling subcortical brain surface morphometry. Although HDSD provide a useful basis for disease biomarkers, their high dimensionality requires careful treatment in machine learning applications to mitigate the curse of dimensionality. We explored the use of HDSD feature sets by comparing two feature selection approaches, Regularized Random Forest (RRF) and LASSO, against no feature selection (NFS). Each feature set was applied to three classifiers: Random Forest (RF), Support Vector Machines (SVM), and Naive Bayes (NB). Each feature-selection and classifier pairing was evaluated with 10-fold cross-validation on two diagnostic contrasts, Alzheimer's disease versus controls and mild cognitive impairment versus controls, across varying sample sizes to assess robustness. LASSO improved classification efficiency; however, RRF and NFS yielded more robust performance. Performance varied considerably by classifier, with RF being the most stable. We advise careful consideration of performance-efficiency tradeoffs when choosing feature selection strategies for HDSD.
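The comparison design described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins for HDSD features, scikit-learn is assumed as the toolkit, the LASSO penalty `alpha`, the linear SVM kernel, and accuracy as the score are arbitrary choices, and RRF is left out because scikit-learn does not ship an implementation of it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic data with far more features than subjects,
# standing in for real shape descriptors (patients vs. controls).
X, y = make_classification(
    n_samples=120, n_features=500, n_informative=15, random_state=0
)

# Feature selection arms. "NFS" means no feature selection.
# RRF is omitted here (no scikit-learn implementation).
selectors = {
    "NFS": None,
    # Keeps the features whose LASSO coefficients are non-zero.
    "LASSO": SelectFromModel(Lasso(alpha=0.02, max_iter=10000)),
}

# The three classifier families named in the abstract.
classifiers = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="linear"),
    "NB": GaussianNB(),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for sel_name, selector in selectors.items():
    for clf_name, clf in classifiers.items():
        steps = [StandardScaler()]
        if selector is not None:
            steps.append(selector)
        steps.append(clf)
        # Selection lives inside the pipeline, so it is refit on the
        # training portion of every fold and never sees the test fold.
        pipeline = make_pipeline(*steps)
        scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        results[(sel_name, clf_name)] = scores.mean()

for (sel_name, clf_name), score in sorted(results.items()):
    print(f"{sel_name:5s} + {clf_name:3s}: mean 10-fold accuracy = {score:.3f}")
```

Placing the selector inside the pipeline matters: selecting features on the full data set before cross-validation would leak test-fold information into the selected feature set and inflate the scores. Repeating the loop on subsamples of different sizes would approximate the sample-size robustness analysis the abstract mentions.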
