Towards Portable Large-Scale Image Processing with High-Performance Computing

High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called “spiders.” The VUIIS CCI medical image data storage and processing infrastructure has housed and processed nearly half a million medical image volumes with the Vanderbilt Advanced Computing Center for Research and Education (ACCRE), the HPC facility at Vanderbilt University. The initial deployment was native (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which led to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with different hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address these issues, we describe recent innovations using containerization techniques with XNAT/DAX, which isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments.
The newly presented XNAT/DAX solution has the following features: (1) multi-level portability from the system level to the application level, (2) flexible and dynamic software development and expansion, and (3) scalable spider deployment compatible with HPC clusters and local workstations.
