An empirical comparison of different approaches for combining multimodal neuroimaging data with support vector machine

In the pursuit of clinical utility, neuroimaging researchers studying psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification, either by generating a single decision function based on an integrated kernel matrix or by creating an ensemble of multiple single-modality classifiers and integrating their predictions. Here, we describe four integrative approaches, (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We do so by integrating structural, functional, and diffusion tensor magnetic resonance imaging data to compare ultra-high risk (n = 19), first episode psychosis (n = 19), and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited; (ii) where classification can be enhanced, simple methods may yield greater increases than more computationally complex alternatives; and (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy.
However, it remains possible that this conclusion depends on the use of neuroimaging modalities that had little or no complementary information to offer one another, and that integrating more diverse types of data would have produced greater classification enhancement. We suggest that future studies examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations best suited to the classification of early stage psychosis.
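
Three of the four integrative approaches named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of signed decision values for prediction averaging, and the +1/-1 label coding are assumptions made for illustration, and multi-kernel learning is omitted because it requires an optimizer that learns the kernel weights.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def sum_kernels(kernels: Sequence[Matrix]) -> Matrix:
    """Un-weighted sum of same-shaped kernel (Gram) matrices, one per modality.

    The summed matrix would then be passed to a single SVM (not shown here).
    """
    n = len(kernels[0])
    return [[sum(k[i][j] for k in kernels) for j in range(n)] for i in range(n)]


def average_predictions(scores: Sequence[Sequence[float]]) -> List[int]:
    """Average the per-modality decision values for each subject.

    Assumption: each classifier outputs a signed score, and the sign of the
    mean score gives the combined label (+1 or -1).
    """
    n_subjects = len(scores[0])
    means = [sum(s[i] for s in scores) / len(scores) for i in range(n_subjects)]
    return [1 if m >= 0 else -1 for m in means]


def majority_vote(labels: Sequence[Sequence[int]]) -> List[int]:
    """Each modality casts one +1/-1 vote per subject; the majority wins.

    With an odd number of classifiers the vote total is never zero, so no
    tie-breaking rule is needed. A tie (even count) falls to -1 here.
    """
    n_subjects = len(labels[0])
    totals = [sum(l[i] for l in labels) for i in range(n_subjects)]
    return [1 if t > 0 else -1 for t in totals]


# Hypothetical decision values for three subjects from three
# single-modality classifiers (structural, functional, diffusion).
structural = [0.9, -0.2, 0.1]
functional = [-0.1, -0.4, 0.2]
diffusion = [-0.2, 0.3, -0.9]
all_scores = [structural, functional, diffusion]

# Hard labels per modality, derived from the sign of each score.
all_labels = [[1 if v >= 0 else -1 for v in s] for s in all_scores]

averaged = average_predictions(all_scores)
voted = majority_vote(all_labels)
```

In this toy input the two ensemble rules disagree for the first and third subjects: averaging lets one confident classifier outweigh two weakly opposed ones, whereas voting discards score magnitude. This is one reason the integration methods can rank differently across diagnostic comparisons.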
