Parallel stochastic simulations of budding yeast cell cycle: load balancing strategies and theoretical analysis

The evolution of biochemical systems in which some chemical species are present in only small numbers of molecules is strongly influenced by discrete and stochastic effects. This evolution cannot be accurately captured by continuous, deterministic models; stochastic models are required. The budding yeast cell cycle is an excellent example of the need to capture stochastic effects in biochemical reactions. To obtain statistics of the cell evolution, a stochastic simulation algorithm must be run thousands of times with different initial conditions and parameter values. To manage the computational expense, this large ensemble of runs needs to be executed in parallel. Each individual task is a stochastic simulation. The CPU time per task is not known in advance and can vary considerably from one simulation to another. Because of this variability, serious load imbalances appear and may considerably reduce the efficiency of the parallel computation. This paper proposes two dynamic load balancing strategies for parallel runs of large ensembles of stochastic simulations of biological systems. A new probabilistic analysis framework is developed to quantify the performance of the load balancing algorithms when the CPU times per task are not known in advance. Simulation results with a stochastic budding yeast cell cycle model confirm the theoretical analysis. While this work is motivated by cell cycle modeling, the proposed analysis framework is general and can be directly applied to any ensemble simulation in which many tasks are mapped onto each processor and the individual compute times vary considerably among tasks.
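To illustrate why variable per-task CPU times cause load imbalance, the following minimal Python sketch compares a static round-robin assignment with a generic dynamic scheme in which each idle worker pulls the next pending task. It is not one of the two strategies proposed in the paper, which the abstract does not specify. The function names, the log-normal distribution of task times, and the worker count are all assumptions made for illustration only.

```python
import heapq
import random


def sample_task_times(n_tasks, rng):
    """Draw hypothetical per-task CPU times.

    Assumption: a log-normal distribution stands in for the unknown,
    highly variable run times of individual stochastic simulations.
    """
    return [rng.lognormvariate(0.0, 1.0) for _ in range(n_tasks)]


def static_makespan(times, workers):
    """Completion time when tasks are pre-assigned round-robin."""
    loads = [0.0] * workers
    for i, t in enumerate(times):
        loads[i % workers] += t
    # The slowest worker determines when the whole ensemble finishes.
    return max(loads)


def dynamic_makespan(times, workers):
    """Completion time when each idle worker pulls the next task.

    A min-heap holds the time at which each worker becomes free; the
    next task always goes to the worker that frees up first.
    """
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for t in times:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + t)
    return max(free_at)
```

Calling both functions on the same sample, for example `sample_task_times(1000, random.Random(0))` with 16 workers, and comparing each result with the ideal value `sum(times) / workers` gives a rough measure of the efficiency lost to imbalance under each scheme.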
