Learning undirected graphical models from multiple datasets with the generalized non-rejection rate

Learning graphical models from multiple datasets is an appealing way to infer transcriptional regulatory interactions from microarray data in molecular biology. The problem has been tackled with both model-based and model-free learning approaches; in the latter, it is common practice to pool datasets produced under different experimental conditions. In this paper, we introduce a quantity called the generalized non-rejection rate, which extends the non-rejection rate introduced by [3] so as to explicitly take into account the different graphical models, each representing a distinct experimental condition, that underlie a dataset produced in multiple experimental batches. We show that the generalized non-rejection rate allows one to learn the edges common to all of these graphical models, making it especially suited to identifying robust transcriptional interactions that are shared by all the experiments considered. The generalized non-rejection rate is then applied to both synthetic and real data and is shown to perform competitively with other widely used methods.

[1]  Sophie Lèbre,et al.  Inferring Dynamic Genetic Networks with Low Order Independencies , 2009, Statistical Applications in Genetics and Molecular Biology.

[2]  Trupti Joshi,et al.  Inferring gene regulatory networks from multiple microarray datasets , 2006, Bioinform..

[3]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[4]  Markus J. Herrgård,et al.  Integrating high-throughput and computational data elucidates bacterial networks , 2004, Nature.

[5]  K. Strimmer,et al.  A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics , 2011, Statistical Applications in Genetics and Molecular Biology.

[6]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[7]  Alberto de la Fuente,et al.  Discovery of meaningful associations in genomic data using partial correlation coefficients , 2004, Bioinform..

[8]  Florence d'Alché-Buc,et al.  INFERENCE OF BIOLOGICAL REGULATORY NETWORKS: MACHINE LEARNING APPROACHES , 2007 .

[9]  Julio Collado-Vides,et al.  RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation , 2007, Nucleic Acids Res..

[10]  Steffen L. Lauritzen,et al.  Graphical models in R , 1996 .

[11]  Jesper Tegnér,et al.  Towards scalable and data efficient learning of Markov boundaries , 2007, Int. J. Approx. Reason..

[12]  Moninder Singh,et al.  Construction of Bayesian network structures from data: A brief survey and an efficient algorithm , 1995, Int. J. Approx. Reason..

[13]  Robert Castelo,et al.  Reverse Engineering Molecular Regulatory Networks from Microarray Data with qp-Graphs , 2009, J. Comput. Biol..

[14]  Björn H. Junker.  Networks in Biology , 2007 .

[15]  Nir Friedman,et al.  Inferring Cellular Networks Using Probabilistic Graphical Models , 2004, Science.

[16]  Emma Steele,et al.  Consensus and Meta-analysis regulatory networks for combining multiple microarray gene expression datasets , 2008, J. Biomed. Informatics.

[17]  Paul M. Magwene,et al.  Estimating genomic coexpression networks using first-order conditional independence , 2004, Genome Biology.

[18]  Björn H. Junker,et al.  Analysis of Biological Networks , 2008 .

[19]  Robert Castelo,et al.  A Robust Procedure For Gaussian Graphical Model Search From Microarray Data With p Larger Than n , 2006, J. Mach. Learn. Res..

[20]  J. Collins,et al.  Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles , 2007, PLoS biology.

[21]  Dennis B. Troup,et al.  NCBI GEO: mining tens of millions of expression profiles—database and tools update , 2006, Nucleic Acids Res..

[22]  Claudio Altafini,et al.  Comparing association network algorithms for reverse engineering of large-scale gene regulatory networks: synthetic versus real data , 2007, Bioinform..

[23]  E. Levina,et al.  Joint estimation of multiple graphical models. , 2011, Biometrika.

[24]  Isaac S. Kohane,et al.  Relevance Networks: A First Step Toward Finding Genetic Regulatory Networks Within Microarray Data , 2003 .

[25]  G. Parmigiani,et al.  The Analysis of Gene Expression Data , 2003 .

[26]  A. Fuente,et al.  From ‘differential expression’ to ‘differential networking’ – identification of dysfunctional regulatory networks in diseases , 2010 .

[27]  Chris Wiggins,et al.  ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context , 2004, BMC Bioinformatics.

[28]  P. Bühlmann,et al.  Low-Order Conditional Independence Graphs for Inferring Genetic Networks , 2011, Statistical Applications in Genetics and Molecular Biology.

[29]  D. Edwards Introduction to graphical modelling , 1995 .