Dense taxonomic EST sampling and its applications for molecular systematics of the Coleoptera (beetles).

Expressed sequence tags (ESTs) can provide a wealth of data for phylogenetic and genomic studies, but the utility of these resources is restricted by poor taxonomic sampling. Here, we use small EST libraries (<1,000 clones) to generate phylogenetic markers across a broad sample of insects, focusing on the species-rich Coleoptera (beetles). We sequenced over 23,000 ESTs from 34 taxa, which yielded 8,728 unique sequences after clustering into nonredundant sequences. Across taxa, the sequences could be grouped into 731 gene clusters, the largest of which corresponded to mitochondrial DNA transcripts and to the chymotrypsin, actin, troponin, and tubulin gene families. While levels of paralogy were high in most gene clusters, several midsized clusters, including many ribosomal protein (RP) genes, appeared to be free of expressed paralogs. To evaluate the utility of EST data for molecular systematics, we curated available transcripts for 66 RP genes from representatives of the major groups of Coleoptera. Phylogenetic analyses using supertree and supermatrix approaches gave results consistent with emerging conclusions about basal relationships in Coleoptera. Numerous small EST libraries from a densely sampled lineage can thus provide a core set of genes that together act as a scaffold for phylogenetic reconstruction, comparative genomics, and studies of gene evolution.


[28]  R. Crowson Insect Phylogeny , 1970, Nature.

[29]  S. Rudd Expressed sequence tags: alternative or complement to whole genome sequences? , 2003, Trends in plant science.

[30]  H. Philippe,et al.  Multigene analyses of bilaterian animals corroborate the monophyly of Ecdysozoa, Lophotrochozoa, and Protostomia. , 2005, Molecular biology and evolution.

[31]  R. A. Crowson The natural classification of the families of coleoptera , 1955 .

[32]  F. Haas,et al.  Phylogenetic Relationships of the Suborders of Coleoptera (Insecta) , 2000 .

[33]  T. Shimada,et al.  The construction of an EST database for Bombyx mori and its application , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[34]  Terry Gaasterland,et al.  The analysis of 100 genes supports the grouping of three highly divergent amoebae: Dictyostelium, Entamoeba, and Mastigamoeba , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[35]  T. Gojobori,et al.  Bmc Evolutionary Biology the Evolutionary Position of Nematodes , 2022 .

[36]  P. Green,et al.  Base-calling of automated sequencer traces using phred. I. Accuracy assessment. , 1998, Genome research.

[37]  F. Sperling,et al.  The current state of insect molecular systematics: a thriving Tower of Babel. , 2000, Annual review of entomology.

[38]  J. Wiens,et al.  Missing data, incomplete taxa, and phylogenetic accuracy. , 2003, Systematic biology.

[39]  John Parkinson,et al.  Expressed sequence tags: analysis and annotation. , 2004, Methods in molecular biology.

[40]  P. Holland,et al.  Phylogenomics of eukaryotes: impact of missing data on large alignments. , 2004, Molecular biology and evolution.

[41]  R. Beutel,et al.  Interrelationships of Staphyliniform groups inferred from 18S and 28S rDNA sequences, with special emphasis on Hydrophiloidea (Coleoptera, Staphyliniformia) , 2004 .

[42]  Mark L. Blaxter,et al.  Making sense of EST sequences by CLOBBing them , 2002, BMC Bioinformatics.

[43]  J. McInerney,et al.  The Opisthokonta and the Ecdysozoa may not be clades: stronger support for the grouping of plant and animal than for animal and fungi and stronger support for the Coelomata than Ecdysozoa. , 2005, Molecular biology and evolution.

[44]  S Blair Hedges,et al.  BMC Evolutionary Biology BioMed Central , 2003 .

[45]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[46]  A. Vogler,et al.  Using exon and intron sequences of the gene Mp20 to resolve basal relationships in Cicindela (Coleoptera:Cicindelidae). , 2004, Systematic biology.

[47]  Naiara Rodríguez-Ezpeleta,et al.  Monophyly of Primary Photosynthetic Eukaryotes: Green Plants, Red Algae, and Glaucophytes , 2005, Current Biology.

[48]  G M Rubin,et al.  A Drosophila complementary DNA resource. , 2000, Science.

[49]  P Green,et al.  Base-calling of automated sequencer traces using phred. II. Error probabilities. , 1998, Genome research.

[50]  G. Hewitt,et al.  Phylogeny of the Coleoptera based on mitochondrial cytochrome oxidase I sequence data , 1995, Insect molecular biology.

[51]  M. Luniak Muzeum i Instytut Zoologii PAN , 2006 .

[52]  Terry L. Erwin,et al.  TROPICAL FORESTS: THEIR RICHNESS IN COLEOPTERA AND OTHER ARTHROPOD SPECIES , 1982 .

[53]  C. Keeling,et al.  Comparison of gene representation in midguts from two phytophagous insects, Bombyx mori and Ips pini, using expressed sequence tags. , 2003, Gene.

[54]  B. Baum Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees , 1992 .

[55]  Neil Hall,et al.  A transcriptomic analysis of the phylum Nematoda , 2004, Nature Genetics.

[56]  Richard G. Olmstead,et al.  Combining Data in Phylogenetic Systematics: An Empirical Approach Using Three Molecular Data Sets in the Solanaceae , 1994 .

[57]  S. O’Brien,et al.  A Molecular Phylogeny for Bats Illuminates Biogeography and the Fossil Record , 2005, Science.