Analysis of the distribution of functionally relevant rare codons

Background
Replacing rare codons with more frequent codons is a commonly applied method in heterologous gene expression to increase protein yields. However, in some cases these substitutions decrease protein solubility or activity. To predict such functionally relevant rare codons, a method was developed that is based on an analysis of multisequence alignments of homologous protein families.

Results
The method successfully predicts the functionally relevant rare codons in fatty acid binding protein and chloramphenicol acetyltransferase that had been determined experimentally. However, the analysis of 16 homologous protein families belonging to the α/β hydrolase fold showed that functionally relevant rare codons share no common location with respect to tertiary or secondary structure.

Conclusion
A systematic analysis of multisequence alignments of homologous protein families can be used to predict rare codons with a potential impact on protein expression. Our analysis showed that most genes contain at least one putative rare-codon-rich region. Rare codons located near those regions should be excluded from substitution when protein expression is to be improved by exchanging rare codons for more frequent ones.
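The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of scanning a codon-level alignment of homologous genes for positions where rare codons recur. The function names, the single shared codon usage table, the rarity cutoff, and the minimum fraction are all assumptions made for illustration, and the usage values in the example are toy numbers, not real frequencies.

```python
from typing import Dict, List

GAP = "---"  # assumed gap symbol for one codon position in the alignment


def split_codons(seq: str) -> List[str]:
    """Split an aligned coding sequence into codon-sized chunks."""
    if len(seq) % 3 != 0:
        raise ValueError("aligned coding sequence length must be a multiple of 3")
    return [seq[i:i + 3].upper() for i in range(0, len(seq), 3)]


def conserved_rare_columns(
    aligned: List[str],
    usage: Dict[str, float],
    rare_cutoff: float = 10.0,   # hypothetical cutoff, in the units of `usage`
    min_fraction: float = 0.5,   # hypothetical share of homologs required
) -> List[int]:
    """Return alignment columns where rare codons recur across homologs.

    Simplification: one usage table is applied to every sequence, whereas
    homologs from different organisms would each need their own table.
    """
    rows = [split_codons(s) for s in aligned]
    n_cols = len(rows[0])
    if any(len(r) != n_cols for r in rows):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for col in range(n_cols):
        # Ignore gaps and codons missing from the usage table.
        codons = [r[col] for r in rows if r[col] != GAP and r[col] in usage]
        if not codons:
            continue
        rare = sum(1 for c in codons if usage[c] < rare_cutoff)
        if rare / len(codons) >= min_fraction:
            hits.append(col)
    return hits


# Toy example: values are invented for illustration only.
toy_usage = {"ATG": 25.0, "AGG": 2.0, "AGA": 3.0, "CGT": 20.0, "CTG": 50.0}
toy_alignment = ["ATGAGGCTG", "ATGAGACTG", "ATGCGTCTG"]
print(conserved_rare_columns(toy_alignment, toy_usage))
```

In this toy alignment only the second codon column is flagged, because two of the three sequences carry a codon below the cutoff there. Columns flagged this way would be candidates to leave unchanged during codon optimization, in the spirit of the conclusion above.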
