Techniques for Parallel Simulation of Large Scale Heterogeneous Biophysical Systems

Several multidisciplinary projects have recently begun to model and simulate human biophysical and physiological systems. These projects aim to create databases of models that can be combined into larger and more complex biophysical models. Such large-scale models can then be used to perform in silico (computer-based) simulations that accurately predict the effects of drugs across a range of circumstances and patients. The ultimate goal of these projects is predictive medicine: using patient-specific models to determine the treatment for an ailment in a particular patient, or to predict the side effects of a drug on patients with certain genetic traits. Although this research area has advanced in recent years, tools for performing these simulations effectively are still lacking, particularly for simulating large-scale models on parallel computing platforms, which is required to solve the problems posed by predictive medicine. Many software packages exist for biophysical simulation, but few support parallel simulation of large-scale models, and those that do generally support only a particular type of model.

This dissertation discusses techniques and tools for performing parallel simulations of these kinds of large-scale models, and addresses the subject on multiple scales. First, we examine techniques for performing parallel simulations by converting models into C++ source code, which is then compiled and executed. This method is common because it is simple and makes simulations easy to generate; tools such as MATLAB and Mathematica may also be used as a back end to perform the computation. However, these simulations are difficult to parallelize because of the complexity of the models and the difficulty of writing metacode for parallel simulations. To address this, we use model analysis and redundant computation to simplify the parallel simulation while maintaining good performance and calculating the same results. This technique is nevertheless limited by the time required to compile individual simulations and by the growing complexity as models become more heterogeneous.

Next, we describe insilicoSim, an extensible simulation engine for parallel large-scale biophysical simulations. Rather than creating source code for each model, this engine imports models, converts them to internal data structures, and performs the simulations. This avoids the difficulties that source-based simulation has with large heterogeneous models and allows new types of models and simulations. We present three key components of the simulator that improve extensibility and performance. First, we demonstrate how a standardized plugin interface allows the simulator to be extended easily to new types of input, output, and simulation methods. Second, we detail a technique for improving simulation performance by simplifying simulation-related calculations and compiling them into a byte code representation for fast evaluation. Third, we describe the simulation object manager, which allows shared object access between simulation interfaces while transparently performing parallel synchronization. We demonstrate the effectiveness of these methods by simulating several models on both serial and parallel computing platforms.

Finally, we demonstrate a method for utilizing large-scale unreliable computing resources to perform parallel computations such as those used in modeling biomolecular dynamics. We propose algorithms for computing batches of medium-grained tasks with deadlines in pull-style volunteer computing environments. We first develop models of unreliable workers based on an analysis of trace data from an actual volunteer computing project. These models are then used to develop task distribution algorithms for volunteer computing systems that have a high probability of meeting batch deadlines. We develop algorithms for perfectly reliable workers, computation-reliable workers, and unreliable workers, and we demonstrate their effectiveness through simulations using traces from actual volunteer computing environments.
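
The idea of meeting a batch deadline with high probability on unreliable workers can be illustrated with a simple replication calculation. The sketch below is not one of the algorithms developed in this dissertation; it assumes, purely for illustration, that every worker returns a task result before the deadline independently and with the same probability, and it asks how many copies of each task must be issued so that the whole batch finishes on time with a chosen probability. All function and parameter names are hypothetical.

```python
def task_success_probability(p_on_time: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies
    of a task is returned before the deadline."""
    return 1.0 - (1.0 - p_on_time) ** replicas


def batch_success_probability(p_on_time: float, replicas: int, num_tasks: int) -> float:
    """Probability that every task in the batch has at least one
    on-time result, assuming tasks succeed independently."""
    return task_success_probability(p_on_time, replicas) ** num_tasks


def min_replicas(p_on_time: float, num_tasks: int, target: float,
                 max_replicas: int = 64) -> int:
    """Smallest per-task replication factor whose batch success
    probability reaches `target` (illustrative model only)."""
    if not 0.0 < p_on_time <= 1.0:
        raise ValueError("p_on_time must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    for replicas in range(1, max_replicas + 1):
        if batch_success_probability(p_on_time, replicas, num_tasks) >= target:
            return replicas
    raise ValueError("target not reachable within max_replicas")


if __name__ == "__main__":
    # 100 tasks, each worker on time with probability 0.8, 95% batch target.
    print(min_replicas(0.8, 100, 0.95))  # prints 5
```

Under these assumptions, four copies per task give a batch success probability of roughly 0.85, while five copies give roughly 0.97, which shows why even fairly reliable workers call for noticeable redundancy once a batch contains many tasks. Real volunteer computing traces violate the independence assumption, which is one reason worker models derived from trace data are needed.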
