Predicting protein thermostability changes from sequence upon multiple mutations

Motivation: A basic question in protein science is to what extent mutations affect protein thermostability. This knowledge would be particularly relevant for engineering thermostable enzymes. Several experimental studies have addressed this issue serendipitously. It would therefore be convenient to have a computational method that predicts whether a given protein mutant is more thermostable than its corresponding wild type.

Results: We present a new method based on support vector machines (SVMs) that predicts whether a set of mutations (including insertions and deletions) can enhance the thermostability of a given protein sequence. When trained and tested on a redundancy-reduced dataset, our predictor achieves 88% accuracy and a correlation coefficient of 0.75. Our predictor also correctly classifies 12 out of 14 experimentally characterized protein mutants with enhanced thermostability. Finally, it correctly detects all 11 mutated proteins whose increase in stability temperature is >10°C.

Availability: The dataset and the list of protein clusters adopted for the SVM cross-validation are available at the web site http://lipid.biocomp.unibo.it/~ludovica/thermo-meso-MUT.

Contact: casadio@alma.unibo.it
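The two performance figures quoted in the Results (accuracy and correlation coefficient) can be illustrated with a minimal sketch. It assumes that the correlation coefficient is the Matthews correlation coefficient commonly reported for binary classifiers, which the abstract does not state explicitly. The function name `accuracy_and_mcc` and the confusion-matrix counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, Matthews correlation) for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives. The choice of the Matthews
    coefficient is an assumption, not confirmed by the abstract.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Product of the four marginal sums of the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when a marginal sum is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts, for illustration only (not the paper's data).
acc, mcc = accuracy_and_mcc(tp=40, tn=40, fp=10, fn=10)
print(f"accuracy={acc:.2f}, correlation={mcc:.2f}")
```

Unlike accuracy, this coefficient stays near zero for a predictor that always outputs the majority class, which is one reason the two figures are usually reported together.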
