Reproducibility of neuroimaging analyses across operating systems

Neuroimaging pipelines are known to produce different results depending on the computing platform on which they are compiled and executed. We quantify these differences for brain tissue classification, fMRI analysis, and cortical thickness (CT) extraction, using three of the main neuroimaging packages (FSL, FreeSurfer, and CIVET) and different versions of GNU/Linux. We also identify some causes of these differences by intercepting library and system calls. We find that these packages use mathematical functions based on single-precision floating-point arithmetic, whose implementations in operating systems continue to evolve. While these differences have little or no impact on simple analysis pipelines such as brain extraction and cortical tissue classification, they accumulate into substantial differences in longer pipelines such as subcortical tissue classification, fMRI analysis, and cortical thickness extraction. With FSL, most Dice coefficients between subcortical classifications obtained on different operating systems remain above 0.9, but values as low as 0.59 are observed. Independent component analyses (ICA) of fMRI data differ between operating systems in one third of the tested subjects, due to differences in motion correction. With FreeSurfer and CIVET, we find an effect of build or operating system on cortical thickness in some brain regions. A first step toward correcting these reproducibility issues would be to use more precise representations of floating-point numbers in the critical sections of the pipelines. The numerical stability of the pipelines should also be reviewed.
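To make the Dice coefficient reported above concrete, the following is a minimal sketch of how the overlap between two tissue classifications of the same image could be computed for one label. It is not the code used in the study: the function name `dice_coefficient`, the flattened toy label maps, and the convention of returning 1.0 when a label is absent from both classifications are assumptions made for illustration.

```python
def dice_coefficient(labels_a, labels_b, label):
    """Dice overlap 2|A n B| / (|A| + |B|) for one label in two segmentations.

    labels_a and labels_b are flattened label maps of the same image
    (hypothetical inputs, e.g. one per operating system).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("segmentations must have the same number of voxels")

    # Boolean masks of the voxels carrying the label in each segmentation.
    in_a = [value == label for value in labels_a]
    in_b = [value == label for value in labels_b]

    size_a = sum(in_a)
    size_b = sum(in_b)
    if size_a + size_b == 0:
        # Label absent from both maps: treated here as perfect agreement
        # (a convention chosen for this sketch).
        return 1.0

    overlap = sum(1 for a, b in zip(in_a, in_b) if a and b)
    return 2.0 * overlap / (size_a + size_b)


# Toy example: two classifications that disagree on a single voxel.
seg_os1 = [0, 1, 1, 1, 2, 2, 0, 0]
seg_os2 = [0, 1, 1, 2, 2, 2, 0, 0]
print(dice_coefficient(seg_os1, seg_os2, 1))
```

A value of 1 means the two classifications agree exactly on that label and 0 means they share no voxel, so the reported minimum of 0.59 corresponds to a large disagreement for a single structure.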
