Gene regulatory network inference using PLS-based methods

Background: Inferring the topology of gene regulatory networks (GRNs) from microarray gene expression data has many potential applications, such as identifying candidate drug targets and providing insight into biological processes. It remains challenging because the data are noisy and high dimensional, and the number of potential interactions is large.

Results: We introduce PLSNET, an ensemble gene regulatory network inference method that decomposes the GRN inference problem with p genes into p subproblems and solves each subproblem with a partial least squares (PLS)-based feature selection algorithm. A statistical technique is then used to refine the predictions. The proposed method was evaluated on the DREAM4 and DREAM5 benchmark datasets and achieved higher accuracy than the winners of those competitions and other state-of-the-art GRN inference methods.

Conclusions: The superior accuracy achieved on different benchmark datasets, including both in silico and in vivo networks, shows that PLSNET reaches state-of-the-art performance.
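
The decomposition described in the Results can be illustrated with a minimal sketch: each gene in turn is treated as the response, all other genes as candidate regulators, and a PLS fit ranks the regulators. This is not the PLSNET implementation. The function names (`pls1_vip`, `infer_network`), the NIPALS form of PLS1, and the use of variable importance in projection (VIP) as the feature score are all assumptions made for illustration, since the abstract does not specify them. The ensemble resampling and the statistical refinement step are omitted.

```python
import numpy as np


def pls1_vip(X, y, n_components=2):
    """VIP scores from a PLS1 (NIPALS) fit of y on X.

    Assumption: VIP is used as the feature-importance score; the abstract
    does not say which PLS-based score PLSNET uses. X and y must be centered.
    """
    X = X.copy()
    y = y.copy()
    n_features = X.shape[1]
    weights, explained = [], []
    for _ in range(n_components):
        w = X.T @ y                      # covariance direction
        norm = np.linalg.norm(w)
        if norm < 1e-12:                 # nothing left to explain
            break
        w /= norm
        t = X @ w                        # latent score vector
        tt = t @ t
        p = X.T @ t / tt                 # X loading
        q = (y @ t) / tt                 # y loading
        X -= np.outer(t, p)              # deflate predictors
        y -= q * t                       # deflate response
        weights.append(w)
        explained.append(q * q * tt)     # response variance explained
    if not weights:
        return np.zeros(n_features)
    W = np.column_stack(weights)
    ss = np.asarray(explained)
    return np.sqrt(n_features * (W ** 2) @ ss / ss.sum())


def infer_network(expr, n_components=2):
    """Score every directed edge (regulator -> target) from an expression matrix.

    expr has shape (samples, genes). Entry [i, j] of the result is the
    importance of gene i as a regulator of gene j; the diagonal stays zero.
    """
    expr = np.asarray(expr, dtype=float)
    n_genes = expr.shape[1]
    std = expr.std(axis=0)
    std[std == 0] = 1.0                  # avoid division by zero
    Z = (expr - expr.mean(axis=0)) / std
    scores = np.zeros((n_genes, n_genes))
    for target in range(n_genes):        # one subproblem per gene
        regulators = np.delete(np.arange(n_genes), target)
        scores[regulators, target] = pls1_vip(
            Z[:, regulators], Z[:, target], n_components
        )
    return scores
```

Sorting the off-diagonal entries of the returned matrix in descending order gives a ranked list of candidate regulatory edges, which is the form in which predictions are typically scored on benchmarks such as DREAM4 and DREAM5.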
