A multiple hold-out framework for Sparse Partial Least Squares

BACKGROUND Supervised machine learning classifiers may be of limited use when studying brain diseases with heterogeneous populations, because the diagnostic labels might be unreliable. More exploratory approaches, such as Sparse Partial Least Squares (SPLS), may provide insights into the brain's mechanisms by finding relationships between neuroimaging and clinical/demographic data. Identifying these relationships has the potential to improve the current understanding of disease mechanisms, refine clinical assessment tools, and stratify patients. SPLS finds multivariate associative effects in the data by computing pairs of sparse weight vectors. Each pair is used to remove its corresponding associative effect from the data by matrix deflation before additional pairs are computed.

NEW METHOD We propose a novel SPLS framework that selects an adequate number of voxels and clinical variables to describe each associative effect, and tests their reliability by fitting the model to different splits of the data. As a proof of concept, the approach was applied to find associations between grey matter probability maps and individual items of the Mini-Mental State Examination (MMSE) in a clinical sample with various degrees of dementia.

RESULTS The framework found two statistically significant associative effects between subsets of brain voxels and subsets of the questions/tasks.

COMPARISON WITH EXISTING METHODS SPLS was compared with its non-sparse version (PLS). The use of projection deflation versus classical PLS deflation was also tested in both PLS and SPLS.

CONCLUSIONS SPLS outperformed PLS, finding statistically significant effects and providing higher correlation values in hold-out data. Moreover, projection deflation gave better results than classical PLS deflation.
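The abstract describes the method only in words, so the following is a minimal illustrative sketch in Python/NumPy of its three ingredients: computing one pair of sparse weight vectors, removing the associated effect by projection deflation, and scoring the pair on a single hold-out split. It is not the authors' implementation. The alternating soft-thresholding update with L1 bounds is one common way to obtain sparse weights and is an assumption here, as are all function names and the parameters `c_u` and `c_v`.

```python
import numpy as np


def soft_threshold(a, delta):
    """Shrink each entry of `a` towards zero by `delta`."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def sparse_unit_vector(a, c, n_iter=50):
    """Soft-threshold `a` to unit L2 norm with L1 norm <= c (assumes c >= 1)."""
    w = a / np.linalg.norm(a)
    if np.abs(w).sum() <= c:
        return w  # the constraint is already satisfied without thresholding
    # Fallback: a 1-sparse unit vector, which always has L1 norm equal to 1.
    k = np.argmax(np.abs(a))
    best = np.zeros(a.shape, dtype=float)
    best[k] = np.sign(a[k])
    lo, hi = 0.0, np.abs(a).max()
    for _ in range(n_iter):  # binary search on the threshold
        mid = 0.5 * (lo + hi)
        s = soft_threshold(a, mid)
        norm = np.linalg.norm(s)
        if norm == 0.0:  # threshold too large, everything was zeroed
            hi = mid
            continue
        s = s / norm
        if np.abs(s).sum() > c:
            lo = mid
        else:
            hi, best = mid, s
    return best


def fit_weight_pair(X, Y, c_u, c_v, n_iter=100, tol=1e-6):
    """Alternate sparse updates of u (for X) and v (for Y) on C = X^T Y."""
    C = X.T @ Y
    v = np.linalg.svd(C, full_matrices=False)[2][0]  # first right singular vector
    for _ in range(n_iter):
        u = sparse_unit_vector(C @ v, c_u)
        v_new = sparse_unit_vector(C.T @ u, c_v)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return u, v


def projection_deflation(X, w):
    """Remove the direction `w` (unit norm) from X: X (I - w w^T)."""
    return X - np.outer(X @ w, w)


def holdout_correlation(X_tr, Y_tr, X_te, Y_te, c_u, c_v):
    """Fit on the training split, then correlate the projections of the test split."""
    mx, my = X_tr.mean(axis=0), Y_tr.mean(axis=0)  # centre with training means only
    u, v = fit_weight_pair(X_tr - mx, Y_tr - my, c_u, c_v)
    r = np.corrcoef((X_te - mx) @ u, (Y_te - my) @ v)[0, 1]
    return r, u, v
```

In a multiple hold-out setting, `holdout_correlation` would be called on several random train/test splits, and the resulting correlations would be compared against a null distribution obtained by permuting the rows of one data block. Before fitting a second pair, both blocks would be passed through `projection_deflation` with their respective weight vectors.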
