Assessing the functional structure of genomic data

Motivation: The availability of genome-scale data has enabled an abundance of novel analysis techniques for investigating a variety of systems-level biological relationships. As thousands of such datasets become available, they provide an opportunity to study high-level associations between cellular pathways and processes. They also allow the exploration of shared functional enrichments between diverse biological datasets, and they can direct experimenters to areas of low data coverage or areas with a high probability of new discoveries.

Results: We analyze the functional structure of Saccharomyces cerevisiae datasets from over 950 publications in the context of over 140 biological processes. This includes a coverage analysis of biological processes given current high-throughput data, a data-driven map of associations between processes, and a measure of similar functional activity between genome-scale datasets. This analysis uncovers subtle gene expression similarities in three otherwise disparate microarray datasets due to a shared strain background. We also provide several means of predicting areas of yeast biology likely to benefit from additional high-throughput experimental screens.

Availability: Predictions are provided in supplementary tables; software and additional data are available from the authors on request.

Contact: ogt@princeton.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
