A Wall-time Minimizing Parallelization Strategy for Approximate Bayesian Computation

Approximate Bayesian Computation (ABC) is a widely applicable and popular approach to estimating unknown parameters of mechanistic models. As ABC analyses are computationally expensive, parallelization on high-performance infrastructure is often necessary. However, existing parallelization strategies leave resources idle at times and thus do not use them optimally. We present look-ahead scheduling, a wall-time minimizing parallelization strategy for ABC Sequential Monte Carlo algorithms, which utilizes all available resources at practically all times by proactively sampling for prospective tasks. Our strategy can be integrated with, for example, adaptive distance function and summary statistic selection schemes, which is essential in practice. Evaluation of the strategy on different problems and numbers of parallel cores reveals speed-ups of typically 10-20% and up to 50% compared to the best established approach. Thus, the proposed strategy can substantially improve the cost and run-time efficiency of ABC methods on high-performance infrastructure.
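The idle-time argument can be illustrated with a toy wall-time model. The sketch below is neither the authors' implementation nor the API of any ABC library. It assumes a fixed list of simulation run times per generation, ignores acceptance and rejection as well as any reweighting of proactively drawn samples, and treats every simulation started ahead of time as usable. The names `wall_time`, `durations_per_generation` and `n_workers` are hypothetical and chosen for illustration only.

```python
import heapq
import random


def wall_time(durations_per_generation, n_workers, look_ahead):
    """Wall time of a toy multi-generation run on n_workers parallel workers.

    durations_per_generation: list of lists of simulation run times.
    look_ahead=False: a generation starts only after the previous one has
        completely finished, so workers idle at the generation barrier.
    look_ahead=True: a free worker immediately starts simulations of the
        next generation (toy stand-in for proactive sampling).
    """
    free_at = [0.0] * n_workers  # min-heap of times at which workers are free
    finish = 0.0                 # time at which the latest simulation ends
    for durations in durations_per_generation:
        if not look_ahead:
            # Barrier: every worker waits until the previous generation is done.
            free_at = [finish] * n_workers
        for d in durations:
            start = heapq.heappop(free_at)  # earliest available worker
            end = start + d
            heapq.heappush(free_at, end)
            finish = max(finish, end)
    return finish


# Hypothetical workload: 5 generations, 200 simulations each, with
# exponentially distributed run times (an assumption, not from the paper).
rng = random.Random(0)
durations = [[rng.expovariate(1.0) for _ in range(200)] for _ in range(5)]

baseline = wall_time(durations, n_workers=16, look_ahead=False)
ahead = wall_time(durations, n_workers=16, look_ahead=True)
print(f"barrier per generation: {baseline:.2f}")
print(f"toy look-ahead:         {ahead:.2f}")
print(f"relative saving:        {1 - ahead / baseline:.1%}")
```

In this simplified model the look-ahead variant can never be slower than the barrier variant, because removing the barrier only lets simulations start earlier. The size of the saving depends entirely on the assumed run-time distribution and worker count, so the printed numbers should not be read as a reproduction of the reported speed-ups.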
