The feasibility of genome-scale biological network inference using Graphics Processing Units

Systems research in fields ranging from biology to finance involves identifying models that represent the underpinnings of complex systems. Formal approaches to data-driven identification of network interactions include statistical inference methods and methods that identify dynamical systems models capable of fitting multivariate data. The availability of large data sets and so-called ‘big data’ applications in biology presents great opportunities as well as major challenges for systems identification and reverse engineering. For example, both inverse identification and forward simulation of genome-scale gene regulatory network models are compute-intensive problems. This issue is addressed here by combining the processing power of Graphics Processing Units (GPUs) with a parallel reverse-engineering algorithm for inference of regulatory networks. It is shown that, given an appropriate data set, information on genome-scale networks (systems of 1000 or more state variables) can be inferred with a reverse-engineering algorithm in a matter of days on a small-scale modern GPU cluster.
