Visualization and unsupervised predictive clustering of high-dimensional multimodal neuroimaging data

BACKGROUND: Neuroimaging machine learning studies have largely relied on supervised algorithms, which require both neuroimaging scan data and corresponding target variables (e.g. healthy vs. diseased) to be 'trained' for a prediction task. This approach may be suboptimal or infeasible when the global structure of the data is not well known and the researcher has no a priori model to fit to the data.

NEW METHOD: We investigated the utility of an unsupervised machine learning technique, t-distributed stochastic neighbour embedding (t-SNE), for identifying 'unseen' sample population patterns that may exist in high-dimensional neuroimaging data. Multimodal neuroimaging scans from 92 healthy subjects were pre-processed using atlas-based methods, integrated, and input into the t-SNE algorithm. Patterns and clusters discovered by the algorithm were visualized in a 2D scatter plot and further analyzed using the K-means clustering algorithm.

COMPARISON WITH EXISTING METHODS: t-SNE was evaluated against classical principal component analysis.

CONCLUSION: Based on unlabelled multimodal scan data alone, t-SNE separated study subjects into two distinct clusters that corresponded to subjects' gender labels (cluster silhouette index = 0.79). The resulting clusters were used to develop an unsupervised minimum distance clustering model that correctly identified the gender of 93.5% of subjects. From a neuropsychiatric perspective, this method may allow discovery of data-driven disease phenotypes or sub-types of treatment responders.
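The steps that follow the embedding (K-means on the 2D coordinates, the silhouette index, and minimum distance assignment of a subject to the nearest cluster centre) can be illustrated with a minimal sketch. This is not the authors' implementation: the t-SNE step is replaced by synthetic 2D coordinates standing in for an embedding, the hidden group labels are simulated, and every function name below is hypothetical.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def centroid(points):
    """Mean position of a non-empty list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest(point, centroids):
    """Minimum distance rule: index of the closest cluster centre."""
    return min(range(len(centroids)), key=lambda k: dist(point, centroids[k]))

def kmeans(points, k=2, iters=50):
    """Plain K-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(centroid(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    # Recompute labels so they always match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

def silhouette(points, labels):
    """Mean silhouette index: (b - a) / max(a, b) averaged over all points."""
    scores = []
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if j != i and labels[j] == labels[i]]
        if not same:  # singleton cluster: silhouette defined as 0
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def agreement(labels, truth):
    """Fraction of matching labels, allowing for swapped cluster names."""
    hits = sum(1 for lab, t in zip(labels, truth) if lab == t)
    return max(hits, len(labels) - hits) / len(labels)

# ASSUMPTION: synthetic stand-in for a 2D embedding of 92 subjects,
# drawn from two simulated groups; these are not data from the study.
rng = random.Random(0)
truth = [0] * 46 + [1] * 46
embedding = [(rng.gauss(-5.0 if t == 0 else 5.0, 1.0), rng.gauss(0.0, 1.0))
             for t in truth]

centroids, labels = kmeans(embedding, k=2)
score = silhouette(embedding, labels)
match = agreement(labels, truth)
print(f"silhouette index: {score:.2f}, agreement with hidden labels: {match:.3f}")

# Minimum distance assignment of a previously unseen point.
new_subject = (4.2, -0.3)
print("assigned cluster:", nearest(new_subject, centroids))
```

In this sketch the hidden labels are used only after clustering, to check how well the clusters agree with them, which mirrors the way gender labels are compared with the discovered clusters in the abstract. Assigning an unseen subject would in practice also require placing that subject in the embedding space, a step this sketch does not address.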
