Parallelization and High-Performance Computing Enables Automated Statistical Inference of Multi-scale Models.

Mechanistic understanding of multi-scale biological processes, such as cell proliferation in a changing biological tissue, is readily facilitated by computational models. While tools exist to construct and simulate multi-scale models, the statistical inference of the unknown model parameters remains an open problem. Here, we present and benchmark a parallel approximate Bayesian computation sequential Monte Carlo (pABC SMC) algorithm, tailored for high-performance computing clusters. pABC SMC is fully automated and returns reliable parameter estimates and confidence intervals. By running the pABC SMC algorithm for ∼10⁶ hr, we parameterize multi-scale models that accurately describe quantitative growth curves and histological data obtained in vivo from individual tumor spheroid growth in media droplets. The models capture the hybrid deterministic-stochastic behaviors of 10⁵ to 10⁶ cells growing in a 3D dynamically changing nutrient environment. The pABC SMC algorithm reliably converges to a consistent set of parameters. Our study demonstrates a proof of principle for robust, data-driven modeling of multi-scale biological systems and the feasibility of multi-scale model parameterization through statistical inference.
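To make the idea of a parallel ABC SMC scheme concrete, the following is a minimal sketch, not the authors' pABC SMC implementation. It assumes a toy model (the mean of 50 Gaussian draws), a uniform prior on [-5, 5], a hand-picked tolerance schedule, and a Gaussian perturbation kernel; the names `simulate`, `kernel_pdf`, and `abc_smc` are hypothetical. A thread pool stands in for cluster workers purely to show where the independent simulations of one generation can be dispatched in parallel.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

PRIOR_LO, PRIOR_HI = -5.0, 5.0  # assumed uniform prior bounds for the toy problem


def simulate(task):
    """Toy stochastic model (assumption): mean of 50 draws from N(theta, 1)."""
    theta, seed = task
    rng = random.Random(seed)  # private RNG per task, so results do not depend on scheduling
    return sum(rng.gauss(theta, 1.0) for _ in range(50)) / 50


def kernel_pdf(x, mean, sd):
    """Density of the Gaussian perturbation kernel."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def abc_smc(observed, epsilons, n_particles=200, n_workers=4, seed=0):
    """Return a weighted particle approximation of the ABC posterior."""
    rng = random.Random(seed)
    particles, weights, sd = [], [], None
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for t, eps in enumerate(epsilons):
            if t > 0:
                # Kernel variance = twice the weighted variance of the previous population.
                mean = sum(w * p for w, p in zip(weights, particles))
                var = sum(w * (p - mean) ** 2 for w, p in zip(weights, particles))
                sd = math.sqrt(2.0 * var)
            new_particles, new_weights = [], []
            while len(new_particles) < n_particles:
                # Propose one batch of candidates in the main thread.
                batch = []
                while len(batch) < n_particles:
                    if t == 0:
                        theta = rng.uniform(PRIOR_LO, PRIOR_HI)
                    else:
                        theta = rng.gauss(rng.choices(particles, weights)[0], sd)
                        if not PRIOR_LO <= theta <= PRIOR_HI:
                            continue  # zero prior density: discard the proposal
                    batch.append((theta, rng.getrandbits(32)))
                # The simulations are independent, so they are mapped over the worker pool.
                for (theta, _), summary in zip(batch, pool.map(simulate, batch)):
                    if len(new_particles) == n_particles:
                        break
                    if abs(summary - observed) <= eps:
                        if t == 0:
                            w = 1.0
                        else:
                            # Importance weight: (constant) prior density / mixture proposal density.
                            w = 1.0 / sum(wj * kernel_pdf(theta, pj, sd)
                                          for wj, pj in zip(weights, particles))
                        new_particles.append(theta)
                        new_weights.append(w)
            total = sum(new_weights)
            particles = new_particles
            weights = [w / total for w in new_weights]
    return particles, weights
```

Calling `abc_smc(2.0, [1.0, 0.5, 0.2])` returns 200 particles with normalized weights; their weighted mean serves as a point estimate of the toy parameter. On a real cluster, the thread pool would be replaced by a process pool or a job scheduler, because the simulations of a multi-scale model dominate the run time.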
