Extremely intron-rich genes in the alveolate ancestors inferred with a flexible maximum-likelihood approach.

Chromalveolates are a large, diverse supergroup of unicellular eukaryotes that includes Apicomplexa, dinoflagellates, and ciliates (three lineages that form the alveolate branch), as well as heterokonts, haptophytes, and cryptomonads (three lineages that form the chromist branch). All sequenced chromalveolate genomes have relatively low intron density in protein-coding genes, and few intron positions are shared between chromalveolate lineages. In contrast, genes of different chromalveolates share many intron positions with orthologous genes from other eukaryotic supergroups, in particular with the intron-rich orthologs from animals and plants. Reconstruction of the history of intron gain and loss during the evolution of chromalveolates, using a general and flexible maximum-likelihood approach, indicates that genes of the ancestors of chromalveolates and, particularly, of alveolates had unexpectedly high intron densities. The chromalveolate ancestor is estimated to have had approximately two-thirds of the human intron density, whereas the intron density in the genes of the alveolate ancestor is estimated to be slightly greater than the human intron density. Accordingly, it is inferred that the evolution of chromalveolates was dominated by intron loss. The conclusion that ancestral chromalveolate forms had high intron densities is unexpected because all extant unicellular eukaryotes have relatively few introns and are thought to be unable to maintain numerous introns because of intense purifying selection in their typically large populations. It is suggested that, at early stages of their evolution, chromalveolates went through major population bottlenecks that were accompanied by intron invasion.
