Large-scale comparison of intron positions among animal, plant, and fungal genes

We purge large databases of animal, plant, and fungal intron-containing genes to a 20% similarity level and then identify the most similar animal–plant, animal–fungal, and plant–fungal protein pairs. We locate the introns in each BLAST 2.0 alignment and automatically score matched intron positions and slid (near-matched, within six nucleotides) intron positions. Overall, we find that 10% of the animal introns match plant positions, and a further 7% are “slides.” Fifteen percent of fungal introns match animal positions, and 13% match plant positions. Furthermore, the number of alignments with high numbers of matches deviates greatly from the Poisson expectation. The 30 animal–plant alignments with the most matches (in which 44% of animal introns match plant positions) are, when aligned with fungal genes, also highly enriched for triple matches: 39% of the fungal introns match both animal and plant positions. This is strong evidence for ancestral introns predating the animal–plant–fungal divergence, and it stands in complete opposition to any expectation based on random insertion. In examining the slid introns, we show that at least half are caused by imperfections in the alignments and are most likely actual matches at common positions. Thus, our final estimates are that ≈14% of animal introns match plant positions and that ≈17–18% of fungal introns match animal or plant positions, all of these being likely to be ancestral in the eukaryotes.
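The match/slide scoring described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_introns`, the use of nucleotide coordinates on a shared alignment axis, and the inclusive reading of "within six nucleotides" are all assumptions made for illustration.

```python
from typing import Dict, Iterable

# "Within six nucleotides" comes from the abstract; treating the bound
# as inclusive is an assumption of this sketch.
SLIDE_WINDOW_NT = 6


def classify_introns(
    query_positions: Iterable[int],
    subject_positions: Iterable[int],
    window: int = SLIDE_WINDOW_NT,
) -> Dict[int, str]:
    """Label each query intron as 'match', 'slide', or 'none'.

    Positions are hypothetical nucleotide coordinates of introns,
    already projected onto a common alignment coordinate system.
    """
    subject = sorted(set(subject_positions))
    labels: Dict[int, str] = {}
    for pos in query_positions:
        if not subject:
            labels[pos] = "none"
            continue
        # Distance to the nearest intron in the other sequence.
        nearest = min(abs(pos - s) for s in subject)
        if nearest == 0:
            labels[pos] = "match"
        elif nearest <= window:
            labels[pos] = "slide"
        else:
            labels[pos] = "none"
    return labels


if __name__ == "__main__":
    animal = [30, 100, 200]   # made-up example coordinates
    plant = [30, 104, 250]
    print(classify_introns(animal, plant))
```

With the made-up coordinates above, the intron at 30 is an exact match, the one at 100 is a slide (4 nucleotides from 104), and the one at 200 has no counterpart. Percentages like those quoted in the abstract would then be the fraction of labels of each kind, pooled over all alignments.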
