GeneSPIDER - gene regulatory network inference benchmarking with controlled network and data properties.

A key question in network inference, that has not been properly answered, is what accuracy can be expected for a given biological dataset and inference method. We present GeneSPIDER - a Matlab package for tuning, running, and evaluating inference algorithms that allows independent control of network and data properties to enable data-driven benchmarking. GeneSPIDER is uniquely suited to address this question by first extracting salient properties from the experimental data and then generating simulated networks and data that closely match these properties. It enables data-driven algorithm selection, estimation of inference accuracy from biological data, and a more multifaceted benchmarking. Included are generic pipelines for the design of perturbation experiments, bootstrapping, analysis of linear dependence, sample selection, scaling of SNR, and performance evaluation. With GeneSPIDER we aim to move the goal of network inference benchmarks from simple performance measurement to a deeper understanding of how the accuracy of an algorithm is determined by different combinations of network and data properties.

[1]  Mark D. McDonnell,et al.  Methods for Generating Complex Networks with Selected Structural Properties for Simulations: A Review and Tutorial for Neuroscientists , 2011, Front. Comput. Neurosci..

[2]  Sean C. Warnick,et al.  Robust dynamical network structure reconstruction , 2011, Autom..

[3]  Michael Hecker,et al.  Gene regulatory network inference: Data integration in dynamic models - A review , 2009, Biosyst..

[4]  Julio Collado-Vides,et al.  RegulonDB v8.0: omics data sets, evolutionary conservation, regulatory phrases, cross-validated gold standards and more , 2012, Nucleic Acids Res..

[5]  N. Lytkin,et al.  A comprehensive assessment of methods for de-novo reverse-engineering of genome-scale regulatory networks. , 2011, Genomics.

[6]  Christian L. Müller,et al.  Sparse and Compositionally Robust Inference of Microbial Ecological Networks , 2014, PLoS Comput. Biol..

[7]  Kathleen Marchal,et al.  SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms , 2006, BMC Bioinformatics.

[8]  Albert-László Barabási,et al.  Statistical mechanics of complex networks , 2001, ArXiv.

[9]  B. Bollobás The evolution of random graphs , 1984 .

[10]  Catarina Costa,et al.  The YEASTRACT database: an upgraded information system for the analysis of gene and genomic transcription regulation in Saccharomyces cerevisiae , 2013, Nucleic Acids Res..

[11]  Ralf Herwig,et al.  Reverse Engineering of Gene Regulatory Networks: A Comparative Study , 2009, EURASIP J. Bioinform. Syst. Biol..

[12]  Duncan J. Watts,et al.  Collective dynamics of ‘small-world’ networks , 1998, Nature.

[13]  P. Erdos,et al.  On the evolution of random graphs , 1984 .

[14]  Simon Rogers,et al.  A Bayesian regression approach to the inference of regulatory networks from gene expression data , 2005, Bioinform..

[15]  Stephen P. Boyd,et al.  Inferring stable genetic networks from steady-state data , 2011, Autom..

[16]  Albert,et al.  Emergence of scaling in random networks , 1999, Science.

[17]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[18]  Ralf Herwig,et al.  GeNGe: systematic generation of gene regulatory networks , 2009, Bioinform..

[19]  J. Collins,et al.  Inferring Genetic Networks and Identifying Compound Mode of Action via Expression Profiling , 2003, Science.

[20]  Torbjörn E. M. Nordling,et al.  Interampatteness - a generic property of biochemical networks. , 2009, IET systems biology.

[21]  Christopher A. Penfold,et al.  How to infer gene networks from expression profiles, revisited , 2011, Interface Focus.

[22]  Claudio Cobelli,et al.  A Gene Network Simulator to Assess Reverse Engineering Algorithms , 2009, Annals of the New York Academy of Sciences.

[23]  D. di Bernardo,et al.  How to infer gene networks from expression profiles , 2007, Molecular systems biology.

[24]  Philippe Salembier,et al.  NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference , 2015, BMC Bioinformatics.

[25]  Erik L. L. Sonnhammer,et al.  Optimal Sparsity Criteria for Network Inference , 2013, J. Comput. Biol..

[26]  Dario Floreano,et al.  GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods , 2011, Bioinform..

[27]  Diogo M. Camacho,et al.  Wisdom of crowds for robust gene network inference , 2012, Nature Methods.

[28]  Torbjörn E. M. Nordling,et al.  Robust Inference of Gene Regulatory Networks , 2012, ICONS 2012.

[29]  Chris Wiggins,et al.  ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context , 2004, BMC Bioinformatics.

[30]  Torbjörn E. M. Nordling,et al.  Avoiding pitfalls in L1-regularised inference of gene networks. , 2015, Molecular bioSystems.

[31]  Trevor Hastie,et al.  Regularization Paths for Generalized Linear Models via Coordinate Descent. , 2010, Journal of statistical software.