Borrelia burgdorferi, Clostridium acetobutylicum, Chlamydophila pneumoniae, Chlamydia trachomatis, Deinococcus radiodurans, Escherichia coli, Helicobacter pylori, Leishmania major, Methanococcus jannaschii, Mycobacterium tuberculosis, Neisseria meningitidis, Pseudomonas aeruginosa, Porphyromonas gingivalis, Pyroc

A pair of distinct proteins in one organism may most closely match different parts of a single protein in another organism. Comparing all proteins from the genome of Saccharomyces cerevisiae against all proteins from 24 prokaryotic genomes yields 1010 pairs of yeast proteins whose homologs are parts of one protein from a prokaryotic genome. Marcotte et al. (Science 285:751-3) showed that proteins related in this manner are more likely to interact than proteins chosen at random. In this paper, we investigate whether the genes coding for such proteins are also likely to be transcribed concurrently. We analyzed the transcript-level expression of the genes encoding these 1010 fused pairs and found that the transcriptional profiles of fused gene pairs are significantly closer than those of randomly selected pairs. This finding is reproducible and holds under multiple distance metrics. Moreover, such pairs frequently share additional biologically relevant properties. Thus, while protein fusion patterns are not predictive of co-expression, they are an important element in explaining it. This justifies the use of curated protein fusion events to help characterize gene co-expression clusters.
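The comparison of fused pairs against randomly selected pairs can be illustrated with a minimal sketch. It is not the authors' procedure: the two metrics (one minus Pearson correlation, and Euclidean distance), the resampling test, and every name below (`pearson_distance`, `random_pair_p_value`, `make_toy_data`) are assumptions chosen for illustration, and the expression profiles are synthetic.

```python
import math
import random
from statistics import mean


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # assumption: treat a flat profile as uncorrelated
    return 1.0 - cov / (sx * sy)


def euclidean_distance(x, y):
    """Return the Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_pair_distance(profiles, pairs, metric):
    """Average the metric over a list of (gene, gene) pairs."""
    return mean(metric(profiles[g1], profiles[g2]) for g1, g2 in pairs)


def random_pair_p_value(profiles, fused_pairs, metric, n_trials=1000, seed=0):
    """Estimate how often random pair sets are at least as close as the fused set.

    Returns (observed mean distance, empirical p-value with add-one correction).
    """
    rng = random.Random(seed)
    genes = sorted(profiles)
    observed = mean_pair_distance(profiles, fused_pairs, metric)
    hits = 0
    for _ in range(n_trials):
        # Draw as many random pairs of distinct genes as there are fused pairs.
        sample = [tuple(rng.sample(genes, 2)) for _ in fused_pairs]
        if mean_pair_distance(profiles, sample, metric) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_trials + 1)


def make_toy_data(seed=1, n_fused=5, n_background=20, n_conditions=10):
    """Build synthetic profiles (hypothetical data, not yeast measurements)."""
    rng = random.Random(seed)
    profiles, fused = {}, []
    for i in range(n_fused):
        base = [rng.gauss(0, 1) for _ in range(n_conditions)]
        a, b = f"fusedA{i}", f"fusedB{i}"
        profiles[a] = base
        # The partner follows the same profile with small added noise.
        profiles[b] = [v + rng.gauss(0, 0.1) for v in base]
        fused.append((a, b))
    for i in range(n_background):
        profiles[f"bg{i}"] = [rng.gauss(0, 1) for _ in range(n_conditions)]
    return profiles, fused


if __name__ == "__main__":
    profiles, fused = make_toy_data()
    for metric in (pearson_distance, euclidean_distance):
        observed, p = random_pair_p_value(profiles, fused, metric)
        print(metric.__name__, round(observed, 3), round(p, 4))
```

Running the same test under more than one metric mirrors the abstract's point that the result should not depend on a single choice of distance; with real data, the profiles would come from microarray measurements rather than `make_toy_data`.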

[1] D. Eisenberg, et al. A combined algorithm for genome-wide prediction of protein function, 1999, Nature.

[2] 中尾 光輝, et al. KEGG (Kyoto Encyclopedia of Genes and Genomes) [in Japanese] (Special feature: The present and future of genomic medicine, basic and clinical) (Databases), 2000.

[3] D. Eisenberg, et al. Detecting protein function and protein-protein interactions from genome sequences, 1999, Science.

[4] Dmitrij Frishman, et al. MIPS: a database for genomes and protein sequences, 2000, Nucleic Acids Res.

[5] Anton J. Enright, et al. Protein interaction maps for complete genomes based on gene fusion events, 1999, Nature.

[6] D. Eisenberg, et al. Protein function in the post-genomic era, 2000, Nature.

[7] D. Eisenberg, et al. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles, 1999, Proceedings of the National Academy of Sciences of the United States of America.

[8] M. Ashburner, et al. Gene Ontology: tool for the unification of biology, 2000, Nature Genetics.

[9] Kara Dolinski, et al. Integrating functional genomic information into the Saccharomyces Genome Database, 2000, Nucleic Acids Res.

[10] B. Schwikowski, et al. A network of protein–protein interactions in yeast, 2000, Nature Biotechnology.

[11] D. Botstein, et al. Cluster analysis and display of genome-wide expression patterns, 1998, Proceedings of the National Academy of Sciences of the United States of America.

[12] David Botstein, et al. The Stanford Microarray Database, 2001, Nucleic Acids Res.

[13] A. Brivanlou, et al. Microarray-based analysis of early development in Xenopus laevis, 2001, Developmental Biology.

[14] James R. Knight, et al. A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae, 2000, Nature.

[15] R. Ozawa, et al. A comprehensive two-hybrid analysis to explore the yeast protein interactome, 2001, Proceedings of the National Academy of Sciences of the United States of America.

[16] E. Winzeler, et al. Genomics, gene expression and DNA arrays, 2000, Nature.

[17] P. Uetz, et al. Systematic and large-scale two-hybrid screens, 2000, Current Opinion in Microbiology.

[18] D. Botstein, et al. Genomic expression programs in the response of yeast cells to environmental changes, 2000, Molecular Biology of the Cell.