A combined algorithm for genome-wide prediction of protein function

The availability of over 20 fully sequenced genomes has driven the development of new methods to find protein function and interactions. Here we group proteins by correlated evolution, correlated messenger RNA expression patterns, and patterns of domain fusion to determine functional relationships among the 6,217 proteins of the yeast Saccharomyces cerevisiae. Using these methods, we discover over 93,000 pairwise links between functionally related yeast proteins. Links between characterized and uncharacterized proteins allow a general function to be assigned to more than half of the 2,557 previously uncharacterized yeast proteins. Examples of functional links are given for three cases: a protein family of previously unknown function, a protein whose human homologues are implicated in colon cancer, and the yeast prion Sup35.
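
The idea of linking proteins and then transferring function across those links can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only one of the three evidence types (correlated evolution, approximated here as matching presence/absence profiles across genomes), and the function names, the `max_mismatch` threshold, the majority-vote rule, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def profile_links(profiles, max_mismatch=0):
    """Link proteins whose presence/absence profiles across genomes
    differ in at most `max_mismatch` positions (hypothetical threshold)."""
    links = set()
    for a, b in combinations(sorted(profiles), 2):
        mismatches = sum(x != y for x, y in zip(profiles[a], profiles[b]))
        if mismatches <= max_mismatch:
            links.add((a, b))
    return links


def assign_functions(links, known):
    """Give each uncharacterized protein the most common function among
    its characterized link partners (an assumed voting rule)."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    predicted = {}
    for protein, partners in neighbours.items():
        if protein in known:
            continue  # already characterized, nothing to predict
        votes = Counter(known[p] for p in partners if p in known)
        if votes:
            predicted[protein] = votes.most_common(1)[0][0]
    return predicted


# Toy data, invented for illustration: 1 = homologue present in that genome.
profiles = {
    "P1": (1, 0, 1, 1),
    "P2": (1, 0, 1, 1),
    "P3": (0, 1, 0, 0),
    "P4": (1, 0, 1, 1),  # uncharacterized
}
known = {"P1": "DNA repair", "P2": "DNA repair", "P3": "translation"}

links = profile_links(profiles)
predicted = assign_functions(links, known)
print(links)
print(predicted)
```

In this toy case P4 shares its profile with P1 and P2, so it inherits their annotation. A fuller treatment would merge links from expression correlation and domain fusion before voting.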
