E-CAI: a novel server to estimate an expected value of Codon Adaptation Index (eCAI)

Background

The Codon Adaptation Index (CAI) is a measure of the synonymous codon usage bias of a DNA or RNA sequence. It quantifies the similarity between the synonymous codon usage of a gene and the synonymous codon frequency of a reference set. Extreme values in nucleotide or amino acid composition have a large impact on the differential preference for synonymous codons. It is therefore essential to define the limits for the expected value of CAI on the basis of sequence composition, in order to interpret the CAI properly and to provide statistical support for CAI analyses. Although several freely available programs calculate the CAI for a given DNA sequence, none of them corrects for compositional biases or provides confidence intervals for CAI values.

Results

The E-CAI server, available at http://genomes.urv.es/CAIcal/E-CAI, is a web application that calculates an expected value of CAI for a set of query sequences by generating random sequences with G+C and amino acid content similar to those of the input. An executable file, a tutorial, a Frequently Asked Questions (FAQ) section and several examples are also available. To exemplify the use of the E-CAI server, we have analysed the codon adaptation of human genes that code for a subunit of the mitochondrial respiratory chain and are encoded in the nuclear genome (excluding those genes that lack a prokaryotic orthologue). It is assumed that these genes were transferred from the proto-mitochondrial genome to the nuclear genome and that their codon usage was then ameliorated.

Conclusion

The E-CAI server provides a direct threshold value for discerning whether differences in CAI are statistically significant or whether they are merely artifacts that arise from internal biases in the G+C composition and/or amino acid composition of the query sequences.
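The idea of an expected CAI can be illustrated with a minimal Python sketch. It is not the server's implementation: the three-amino-acid codon table, the reference counts, the function names, the 0.5 pseudo-count, and the choice of the 95th percentile as the threshold are all assumptions made for illustration. The CAI itself follows the usual definition as the geometric mean of relative adaptiveness values; the random sequences here keep the query's amino acid sequence and draw synonymous codons according to its G+C content, which is only one simple way to obtain "similar" composition.

```python
import math
import random

# Toy codon table restricted to three amino acids (assumption, for brevity).
SYNONYMS = {
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
}
CODON_TO_AA = {c: aa for aa, codons in SYNONYMS.items() for c in codons}


def relative_adaptiveness(ref_counts):
    """w(codon) = count / count of the most used synonymous codon."""
    w = {}
    for codons in SYNONYMS.values():
        top = max(ref_counts.get(c, 0) for c in codons)
        for c in codons:
            # A pseudo-count of 0.5 (assumption) avoids log(0) for unseen codons.
            w[c] = max(ref_counts.get(c, 0), 0.5) / max(top, 0.5)
    return w


def split_codons(seq):
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def cai(seq, w):
    """Geometric mean of w over the codons of seq."""
    logs = [math.log(w[c]) for c in split_codons(seq) if c in w]
    return math.exp(sum(logs) / len(logs))


def gc_content(seq):
    return sum(base in "GC" for base in seq) / len(seq)


def random_synonymous(seq, gc, rng):
    """Same amino acids as seq; codons drawn according to base composition."""
    p = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    out = []
    for codon in split_codons(seq):
        options = SYNONYMS[CODON_TO_AA[codon]]
        weights = [p[c[0]] * p[c[1]] * p[c[2]] for c in options]
        out.append(rng.choices(options, weights=weights)[0])
    return "".join(out)


def expected_cai(seq, w, n=500, percentile=0.95, seed=0):
    """Upper percentile of the CAI of n composition-matched random sequences."""
    rng = random.Random(seed)
    gc = min(max(gc_content(seq), 0.01), 0.99)  # keep sampling weights non-zero
    values = sorted(cai(random_synonymous(seq, gc, rng), w) for _ in range(n))
    return values[min(int(percentile * n), n - 1)]


# Hypothetical reference-set codon counts.
ref_counts = {"AAA": 10, "AAG": 30, "TTT": 5, "TTC": 20,
              "GGT": 10, "GGC": 40, "GGA": 5, "GGG": 5}
w = relative_adaptiveness(ref_counts)
query = "AAGTTCGGCAAGTTCGGC"
query_cai = cai(query, w)
threshold = expected_cai(query, w)
normalised = query_cai / threshold
```

Under these assumptions, a ratio `normalised` above 1 suggests that the query's CAI exceeds what its composition alone would produce, whereas a value at or below 1 suggests the CAI could be a compositional artifact.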
