The role of introns in repeat protein gene formation.

Genes composed of tandem repetitive sequence motifs are abundant in nature and are enriched in eukaryotes. To investigate how repeat protein genes form, we have conducted a large-scale analysis of their introns and exons. We find that a wide variety of repeat motifs exhibit a striking conservation of intron position and phase, and are composed of exons that encode one or two complete repeats. These results suggest a simple model of repeat protein gene formation from local duplications. This model is corroborated by amino acid sequence similarity patterns among neighboring repeats from various repeat protein genes. The distribution of one- and two-repeat exons indicates that intron-facilitated repeat motif duplication, in which the start and end points of duplication are located in consecutive intronic regions, is significantly more frequent than intron-independent duplication. These results suggest that introns have contributed to the greater abundance of repeat protein genes in eukaryotes than in prokaryotes, a conclusion that is supported by taxonomic analysis.
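To make "intron phase" and "exons that encode one or two complete repeats" concrete, here is a minimal Python sketch. It is not the authors' analysis pipeline. The function names and the toy exon lengths are hypothetical, chosen only for illustration. It assumes the standard definition of intron phase as the number of coding nucleotides upstream of the intron, modulo 3, and a repeat length of 33 codons (99 nucleotides).

```python
from typing import List, Optional


def intron_phases(exon_cds_lengths: List[int]) -> List[int]:
    """Return the phase of each intron in a gene.

    Phase = (coding nucleotides upstream of the intron) mod 3.
    There is one intron between each pair of consecutive exons.
    """
    phases = []
    upstream = 0
    for length in exon_cds_lengths[:-1]:  # no intron follows the last exon
        upstream += length
        phases.append(upstream % 3)
    return phases


def repeats_per_exon(exon_cds_lengths: List[int],
                     repeat_nt: int) -> List[Optional[int]]:
    """For each exon, return how many complete repeats it could encode,
    or None if its length is not a whole multiple of the repeat length."""
    return [
        length // repeat_nt if length % repeat_nt == 0 else None
        for length in exon_cds_lengths
    ]


# Toy gene (invented numbers): a 100-nt first exon, three internal exons,
# and a 50-nt last exon. The assumed repeat is 33 codons = 99 nt.
exons = [100, 99, 198, 99, 50]
print(intron_phases(exons))               # phase of each of the 4 introns
print(repeats_per_exon(exons[1:-1], 99))  # repeats encoded by internal exons
```

In this toy gene every intron has the same phase and each internal exon is a whole number of repeats long, so duplicating a segment that starts in one intron and ends in the next would preserve the reading frame. That is the situation the intron-facilitated duplication model describes.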
