Simple sequence is abundant in eukaryotic proteins.

All proteins of Saccharomyces cerevisiae were compared to determine how frequently segments of one protein occur in other proteins. Proteins that are recently related by evolution were excluded. The most frequently shared segments are long tandem repetitions of a single amino acid. For some of these segments, up to 14% of all proteins encoded in the genome contain similar peptides. These peptide segments may not be functional protein domains. Although they are the most common shared feature of yeast proteins, their ubiquity and simplicity argue that their probable function is simply to serve as spacers between other protein motifs.
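
The idea of a "long, tandem repetition of a single amino acid" and the fraction of proteins that carry one can be illustrated with a short sketch. This is not the comparison procedure used in the study, which is not described here; it only scans for exact single-residue runs, whereas the abstract speaks of "similar" peptides. The function names, the minimum run length of 10, and the toy sequences are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple


def find_runs(seq: str, min_len: int = 10) -> List[Tuple[str, int, int]]:
    """Return (residue, start, length) for each exact single-residue run.

    min_len is a hypothetical threshold, not a value taken from the paper.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # ([A-Z]) captures one residue; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start(), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


def fraction_with_run(
    proteins: Dict[str, str], residue: str, min_len: int = 10
) -> float:
    """Fraction of proteins containing at least one run of `residue`."""
    if not proteins:
        return 0.0
    hits = sum(
        1
        for seq in proteins.values()
        if any(r == residue for r, _, _ in find_runs(seq, min_len))
    )
    return hits / len(proteins)


# Toy sequences invented for this example; they are not yeast proteins.
toy_proteome = {
    "P1": "MK" + "Q" * 10 + "LA",
    "P2": "MSTNDE",
    "P3": "AA" + "Q" * 12,
    "P4": "S" * 12,
}

if __name__ == "__main__":
    print(find_runs(toy_proteome["P1"]))
    print(fraction_with_run(toy_proteome, "Q"))
```

A realistic analysis would also need to tolerate imperfect repeats and to exclude recently related proteins, as the abstract notes; neither is attempted in this sketch.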
