Analysis of evolution of exon-intron structure of eukaryotic genes

The availability of multiple complete eukaryotic genome sequences makes it possible to address many fundamental evolutionary questions on a genome scale. One important, long-standing problem is the evolution of the exon-intron structure of eukaryotic genes. Analysis of orthologous genes from completely sequenced genomes revealed numerous intron positions shared between animals and plants, and even among animals, plants and protists. The data on shared and lineage-specific intron positions were used as the starting point for evolutionary reconstruction with parsimony and maximum-likelihood approaches. Parsimony methods produce reconstructions with intron-rich ancestors but also infer lineage-specific, and in many cases high, levels of intron loss and gain. Different probabilistic models gave opposite results, apparently depending on model parameters and assumptions: they range from domination of intron loss, with extremely intron-rich ancestors, to a dramatic excess of gains, to the point of denying any true conservation of intron positions among deep eukaryotic lineages. Developing models with adequate, realistic parameters and assumptions seems crucial for obtaining more definitive estimates of intron gain and loss in different eukaryotic lineages. Many shared intron positions were detected in ancestral eukaryotic paralogues that evolved by duplication prior to the divergence of extant eukaryotic lineages. These findings indicate that numerous introns were already present in eukaryotic genes at the earliest stages of eukaryotic evolution, and they are compatible with the hypothesis that the original, catastrophic intron invasion accompanied the emergence of the eukaryotic cell. Comparison of various features of old and younger introns is starting to shed light on probable mechanisms of intron insertion, indicating that propagation of old introns is unlikely to be a major mechanism for the origin of new ones.
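
To make the parsimony reconstruction concrete, the following is a minimal sketch of Dollo parsimony applied to a single intron position. Dollo parsimony is only one possible parsimony variant (the text above does not say which variant was used), and the tree topology, the species labels and the function name `dollo_reconstruct` are hypothetical choices made for illustration. Under the Dollo assumption an intron is gained once, at the last common ancestor of all species that carry it, and every absence inside that clade is explained by loss.

```python
from typing import FrozenSet, List, Optional, Set, Tuple, Union

# A rooted tree as nested tuples; leaves are species names (toy topology, assumed).
Tree = Union[str, Tuple["Tree", ...]]

TOY_TREE: Tree = ((("human", "fly"), "yeast"), ("arabidopsis", "plasmodium"))


def leaves(tree: Tree) -> Set[str]:
    """Return the set of leaf names below (and including) this node."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def dollo_reconstruct(
    tree: Tree, present: Set[str]
) -> Tuple[Optional[FrozenSet[str]], List[FrozenSet[str]]]:
    """Dollo reconstruction for one intron position.

    Returns (gain_clade, losses): the clade whose ancestor gained the intron,
    and the clades on whose stem branches the intron was lost.
    """
    present = present & leaves(tree)
    if not present:
        return None, []

    # Descend to the last common ancestor of all intron-carrying leaves.
    node = tree
    while not isinstance(node, str):
        holders = [c for c in node if leaves(c) & present]
        if len(holders) != 1:
            break
        node = holders[0]

    losses: List[FrozenSet[str]] = []

    def walk(n: Tree) -> None:
        # The intron is present at n; any child clade without a carrier lost it.
        if isinstance(n, str):
            return
        for child in n:
            if leaves(child) & present:
                walk(child)
            else:
                losses.append(frozenset(leaves(child)))

    walk(node)
    return frozenset(leaves(node)), losses


# An intron shared by human and arabidopsis is placed at the root,
# which forces independent losses in fly, yeast and plasmodium.
gain, losses = dollo_reconstruct(TOY_TREE, {"human", "arabidopsis"})
print(sorted(gain), [sorted(clade) for clade in losses])
```

The example shows why parsimony of this kind tends to yield intron-rich ancestors together with many lineage-specific losses: a single position shared by distant lineages is pushed to a deep ancestor, and each intron-free lineage below it counts as a loss.
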
The existence and structure of ancestral protosplice sites were addressed by examining the sequence context of introns inserted within codons that encode amino acids conserved in all eukaryotes and that, accordingly, are not subject to selection for splicing efficiency. Introns were shown to predominantly insert into, or be fixed in, specific protosplice sites with the consensus sequence (A/C)AG|Gt.
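
As a rough illustration of how the exonic context of an intron can be compared with the (A/C)AG|Gt consensus, the sketch below counts matching positions around an exon-exon junction. It assumes that `|` marks the intron position, that the three bases before it are the end of the upstream exon and the two after it are the start of the downstream exon, and it treats the lowercase `t` simply as T. The function name `protosplice_score` and the example sequences are hypothetical, and this is not the statistical procedure used in the analysis.

```python
# Allowed bases at each of the five consensus positions: (A/C) A G | G T
CONSENSUS_LEFT = ("AC", "A", "G")   # last three bases of the upstream exon
CONSENSUS_RIGHT = ("G", "T")        # first two bases of the downstream exon


def protosplice_score(upstream_exon: str, downstream_exon: str) -> int:
    """Count how many of the 5 junction-flanking bases match (A/C)AG|GT."""
    left = upstream_exon.upper()[-3:]
    right = downstream_exon.upper()[:2]
    if len(left) < 3 or len(right) < 2:
        raise ValueError("need at least 3 upstream and 2 downstream exonic bases")
    pairs = zip(left + right, CONSENSUS_LEFT + CONSENSUS_RIGHT)
    return sum(base in allowed for base, allowed in pairs)


# Toy junctions: a full match, a partial match and no match.
for up, down in [("ATGCAG", "GTCC"), ("GGAAG", "GCA"), ("ATGTTT", "CCAA")]:
    print(up, down, protosplice_score(up, down))
```

A score of 5 means the flanking exonic bases match the consensus at every position; restricting such counts to introns inside codons for universally conserved amino acids is what removes the confounding effect of selection on the exonic splice signal.
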
