Creating a Workflow for Expressed Sequence Tags Analysis

Expressed sequence tags (ESTs) are short sequence fragments of genes and are used in genomic and genetic investigations. Although EST generation has expanded rapidly, the resulting sequences are relatively low-quality fragments that must be cleaned before they can be assembled into longer sequences by identifying overlaps between sample sequences. Comparative analysis and functional assignment of the ESTs should then be performed to annotate and classify the genes and to describe their functions. In this study we report the establishment of a workflow for the analysis and assembly of EST sequences into contigs and singlets, together with the implementation of an EST database. High-quality assembled ESTs were annotated using BLASTX on our local BLAST server, searching several databases including the NCBI non-redundant protein database. The BLAST results were automatically extracted and transferred into a relational database. We used well-annotated Gene Ontology (GO) information to characterize gene function and to classify sequences by molecular function, biological process, and cellular component. Pathway mapping was based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) classification, and Enzyme Commission (EC) numbers were used to determine which sequences pertained to a specific pathway.
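The step of extracting BLAST results into a relational database can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes the BLAST hits are available in the 12-column tabular format of BLAST+ (`-outfmt 6`), uses SQLite as a stand-in relational database, and the table name `blast_hit`, the function names, and the E-value cutoff of 1e-5 are hypothetical choices made for illustration.

```python
import sqlite3

# Standard column order of BLAST+ tabular output (-outfmt 6).
# Assumption: the abstract does not say which BLAST output format was parsed.
COLUMNS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore")


def parse_hits(lines):
    """Yield one typed tuple per tab-separated hit line, skipping comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) != len(COLUMNS):
            continue  # ignore malformed rows rather than guessing their layout
        yield (f[0], f[1], float(f[2]),
               int(f[3]), int(f[4]), int(f[5]),
               int(f[6]), int(f[7]), int(f[8]), int(f[9]),
               float(f[10]), float(f[11]))


def load_hits(conn, lines):
    """Create the (hypothetical) blast_hit table and insert all parsed hits."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blast_hit ("
        "qseqid TEXT, sseqid TEXT, pident REAL, length INTEGER, "
        "mismatch INTEGER, gapopen INTEGER, qstart INTEGER, qend INTEGER, "
        "sstart INTEGER, send INTEGER, evalue REAL, bitscore REAL)"
    )
    rows = list(parse_hits(lines))
    conn.executemany(
        "INSERT INTO blast_hit VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def best_hits(conn, max_evalue=1e-5):
    """Return the highest-scoring hit per query among hits passing the cutoff."""
    return conn.execute(
        "SELECT h.qseqid, h.sseqid, h.evalue, h.bitscore "
        "FROM blast_hit AS h "
        "JOIN (SELECT qseqid, MAX(bitscore) AS top FROM blast_hit "
        "      WHERE evalue <= ? GROUP BY qseqid) AS m "
        "  ON h.qseqid = m.qseqid AND h.bitscore = m.top "
        "WHERE h.evalue <= ? ORDER BY h.qseqid",
        (max_evalue, max_evalue),
    ).fetchall()


if __name__ == "__main__":
    # Made-up example rows for two assembled sequences (not data from the study).
    sample = [
        "contig1\tprotA\t85.0\t100\t15\t0\t1\t300\t1\t100\t1e-50\t200.0",
        "contig1\tprotB\t60.0\t90\t36\t0\t1\t270\t5\t94\t1e-10\t80.5",
        "singlet1\tprotC\t30.0\t40\t28\t0\t1\t120\t1\t40\t0.5\t25.0",
    ]
    conn = sqlite3.connect(":memory:")
    load_hits(conn, sample)
    for row in best_hits(conn):
        print(row)
```

Keeping every hit in the table and selecting the best hit per contig or singlet with a query, rather than filtering during parsing, leaves the raw results available for later GO or KEGG lookups against the same database.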
