Éclair—a web service for unravelling species origin of sequences sampled from mixed host interfaces

Identifying the genes that act at the biological interface between two species remains critical to our understanding of the mechanisms of disease resistance, disease susceptibility and symbiosis. Sequencing complementary DNA (cDNA) libraries prepared from the biological interface between two organisms provides an inexpensive way to identify novel genes that may be expressed as a cause or consequence of compatible or incompatible interactions. Sequence classification and annotation of species origin typically use an orthology-based approach and require access to large portions of the genome of each species, or of a close relative. Novel species- or clade-specific sequences may have no counterpart within existing databases and so remain ambiguous. Here we present a web service, Éclair, which uses support vector machines to classify the species origin of expressed sequence tags (ESTs) from mixed host cDNA libraries. In addition to providing an interface for classifying sequences, Éclair lets users train a model to suit their preferred species pair. Éclair is freely available at .
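To make the classification idea concrete, the following is a minimal sketch of how a support vector machine might separate sequences from two species. It is not Éclair's implementation. Several things here are assumptions made for illustration only: the use of codon frequencies (read in frame 0) as features, the Pegasos-style sub-gradient training of a linear SVM, the function names, and the synthetic AT-rich and GC-rich "species" used as toy data.

```python
import random
from itertools import product

# All 64 codons in a fixed order, so every sequence maps to the same feature layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_INDEX = {codon: i for i, codon in enumerate(CODONS)}


def codon_frequencies(seq):
    """Relative codon frequencies of seq, read in frame 0 (an assumption)."""
    seq = seq.upper()
    counts = [0.0] * len(CODONS)
    for i in range(0, len(seq) - 2, 3):
        idx = CODON_INDEX.get(seq[i:i + 3])  # codons with ambiguous bases are skipped
        if idx is not None:
            counts[idx] += 1.0
    total = sum(counts) or 1.0  # avoid division by zero for very short input
    return [c / total for c in counts]


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Linear SVM trained by stochastic sub-gradient descent on the hinge loss
    (Pegasos-style). Labels must be +1 or -1. The last weight acts as a bias
    term and is regularised along with the others, for simplicity."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            x = X[i] + [1.0]       # constant feature for the bias
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, x))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regulariser)
            if margin < 1.0:                          # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, x)]
    return w


def classify(w, seq):
    """Return +1 or -1 for a nucleotide sequence under weight vector w."""
    x = codon_frequencies(seq) + [1.0]
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0.0 else -1


def toy_dataset(rng, n_per_class, length=600):
    """Synthetic stand-ins for two species: AT-rich (+1) and GC-rich (-1)."""
    at_rich = (0.4, 0.1, 0.1, 0.4)  # weights for A, C, G, T
    gc_rich = (0.1, 0.4, 0.4, 0.1)
    seqs, labels = [], []
    for weights, label in ((at_rich, 1), (gc_rich, -1)):
        for _ in range(n_per_class):
            seqs.append("".join(rng.choices("ACGT", weights=weights, k=length)))
            labels.append(label)
    return seqs, labels


def demo(seed=1):
    """Train on synthetic data and return accuracy on held-out synthetic data."""
    rng = random.Random(seed)
    train_seqs, train_labels = toy_dataset(rng, 20)
    test_seqs, test_labels = toy_dataset(rng, 20)
    w = train_linear_svm([codon_frequencies(s) for s in train_seqs], train_labels)
    correct = sum(classify(w, s) == lab for s, lab in zip(test_seqs, test_labels))
    return correct / len(test_seqs)


print("held-out accuracy on synthetic data:", demo())
```

Training on a user-supplied species pair would, under the same assumptions, amount to replacing `toy_dataset` with labelled sequences from the two organisms of interest. Real ESTs carry sequencing errors and unknown reading frames, so a practical pipeline would need a coding-region detection step before computing codon frequencies.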
