Using Make for Reproducible and Parallel Neuroimaging Workflow and Quality-Assurance

This paper describes how to program neuroimaging workflows using Make, a software development tool designed to describe how executables are built from source files. A makefile (a file of instructions for Make) consists of a set of rules that create target files when they are missing, or update them when they have not been modified since their dependencies were last modified. Make processes these rules into a directed acyclic dependency graph, so the workflow can be entered and executed from multiple points. We show that Make provides many of the features of more sophisticated neuroimaging pipeline systems, including reproducibility, parallelization, fault tolerance, and quality assurance reports. We suggest that Make is a large step toward these features at the cost of only a modest increase in programming demands over shell scripts. This approach reduces the technical skill and time required to write, debug, and maintain neuroimaging workflows in a dynamic environment, where pipelines are often modified to accommodate new best practices or to study the effect of alternative preprocessing steps, and where the underlying packages change frequently. A comprehensive manual with lab practicals and examples accompanies this paper (see Supplemental Materials). All data, scripts, and makefiles needed to run the practicals and examples are available in the “makepipelines” project at NITRC.
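The rule semantics described above can be illustrated with a minimal Python sketch. It is not Make itself and not code from the paper: the function names (`is_out_of_date`, `build`, `write_recipe`) and the file names (`raw.txt`, `brain.txt`, `registered.txt`) are hypothetical stand-ins for a scan and two derived images. The sketch only mimics the timestamp comparison and the depth-first walk over the dependency graph, assuming each rule maps one target to a list of dependencies and a recipe.

```python
import os
import tempfile


def is_out_of_date(target, dependencies):
    """True if the target is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


def build(target, rules, log):
    """Bring `target` up to date, updating its dependencies first.

    `rules` maps a target path to (dependencies, recipe). Any path without
    a rule is treated as a source file that must already exist.
    """
    if target not in rules:
        if not os.path.exists(target):
            raise FileNotFoundError(f"no rule to make {target}")
        return
    dependencies, recipe = rules[target]
    for dep in dependencies:
        build(dep, rules, log)  # walk the dependency graph depth-first
    if is_out_of_date(target, dependencies):
        recipe(target, dependencies)
        log.append(target)  # record which targets were actually rebuilt


def write_recipe(target, dependencies):
    """Hypothetical recipe: stands in for a real processing command."""
    with open(target, "w") as out:
        out.write("derived from " + ", ".join(dependencies))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        raw = os.path.join(workdir, "raw.txt")
        brain = os.path.join(workdir, "brain.txt")
        registered = os.path.join(workdir, "registered.txt")
        with open(raw, "w") as f:
            f.write("scan")
        rules = {
            brain: ([raw], write_recipe),
            registered: ([brain], write_recipe),
        }
        rebuilt = []
        build(registered, rules, rebuilt)  # first run builds both targets
        build(registered, rules, rebuilt)  # second run finds nothing to do
        print([os.path.basename(p) for p in rebuilt])
```

Because `build` can be called on any target, an intermediate file such as `brain.txt` can be requested directly, which corresponds to the multiple entry points mentioned above. Real Make additionally handles pattern rules, parallel execution, and error handling, none of which this sketch attempts.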
