Nested effects models for high-dimensional phenotyping screens

MOTIVATION: In high-dimensional phenotyping screens, a large number of cellular features is observed after perturbing genes by knockouts or RNA interference. Comprehensive analysis of perturbation effects is one of the most powerful techniques for attributing functions to genes, but not much work has been done so far to adapt statistical and computational methodology to the specific needs of large-scale and high-dimensional phenotyping screens.

RESULTS: We introduce and compare probabilistic methods to efficiently infer a genetic hierarchy from the nested structure of observed perturbation effects. These hierarchies elucidate the structures of signaling pathways and regulatory networks. Our methods achieve two goals: (1) they reveal clusters of genes with highly similar phenotypic profiles, and (2) they order (clusters of) genes according to subset relationships between phenotypes. We evaluate our algorithms in the controlled setting of simulation studies and show their practical use in two experimental scenarios: (1) a data set investigating the response to microbial challenge in Drosophila melanogaster, and (2) a compendium of expression profiles of Saccharomyces cerevisiae knockout strains. We show that our methods identify biologically justified genetic hierarchies of perturbation effects.

AVAILABILITY: The software used in our analysis is freely available in the R package 'nem' from www.bioconductor.org.
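The two goals named in the results (clustering genes with similar profiles, then ordering clusters by subset relationships between phenotypes) can be illustrated with a small, noise-free sketch. This is not the probabilistic method of the paper and not the API of the 'nem' package; the function name `nested_hierarchy`, the gene labels, and the feature labels are all hypothetical, and the sketch assumes each gene's effects are given as an exact set of affected features.

```python
from itertools import permutations

def nested_hierarchy(effects):
    """Group genes with identical effect sets, then order groups by strict superset.

    `effects` maps a gene name to the set of features changed by perturbing it
    (a hypothetical, noise-free input; real screens need a probabilistic model).
    """
    # Goal 1: cluster genes whose perturbation affects exactly the same features.
    clusters = {}
    for gene, features in effects.items():
        clusters.setdefault(frozenset(features), []).append(gene)
    groups = {tuple(sorted(genes)): feats for feats, genes in clusters.items()}

    # Goal 2: draw an edge A -> B when the effects of B are a strict subset
    # of the effects of A, so A sits above B in the hierarchy.
    edges = {(a, b) for a, b in permutations(groups, 2) if groups[b] < groups[a]}
    return sorted(groups), sorted(edges)

# Hypothetical toy data: gene -> set of features changed by its knockdown.
toy = {
    "g1": {"f1", "f2", "f3", "f4"},
    "g2": {"f1", "f2"},
    "g3": {"f1", "f2"},
    "g4": {"f3"},
}
groups, edges = nested_hierarchy(toy)
```

In this toy input, g2 and g3 share one profile and form a single cluster, and g1 is placed above both that cluster and g4 because its effects contain theirs. Exact set comparison breaks down as soon as measurements are noisy, which is the situation the probabilistic methods in the abstract are meant to address.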
