ccSOL omics: a webserver for solubility prediction of endogenous and heterologous expression in Escherichia coli

Summary: Here we introduce ccSOL omics, a webserver for large-scale calculations of protein solubility. Our method allows (i) proteome-wide predictions; (ii) identification of soluble fragments within each sequence; and (iii) exhaustive single-point mutation analysis.

Results: Using coil/disorder, hydrophobicity, hydrophilicity, β-sheet and α-helix propensities, we built a predictor of protein solubility. Our approach shows an accuracy of 79% on the training set (36 990 Target Track entries). Validation on three independent sets indicates that ccSOL omics discriminates soluble from insoluble proteins with an accuracy of 74% on 31 760 proteins sharing <30% sequence similarity.

Availability and implementation: ccSOL omics can be freely accessed on the web at http://s.tartaglialab.com/page/ccsol_group. Documentation and a tutorial are available at http://s.tartaglialab.com/static_files/shared/tutorial_ccsol_omics.html.

Contact: gian.tartaglia@crg.es

Supplementary information: Supplementary data are available at Bioinformatics online.
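To make the idea of an exhaustive single-point mutation analysis concrete, the following is a minimal sketch of how every single-residue substitution of a sequence could be enumerated and ranked. It is not the ccSOL omics algorithm: the scoring function `toy_score` is a hypothetical stand-in based only on the Kyte-Doolittle hydropathy scale, and the names `single_point_mutants` and `rank_mutations` are assumptions introduced for illustration. A real analysis would replace `toy_score` with the predictor's own score.

```python
from typing import Callable, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (higher values = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def toy_score(seq: str) -> float:
    """Hypothetical placeholder score: negative mean hydropathy.

    This is NOT the ccSOL omics predictor; it only stands in for
    any function that maps a sequence to a solubility-like score.
    """
    return -sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def single_point_mutants(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (1-based position, wild type, mutant residue, mutant sequence)."""
    for i, wild_type in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                yield i + 1, wild_type, aa, seq[:i] + aa + seq[i + 1:]


def rank_mutations(
    seq: str, score: Callable[[str], float] = toy_score
) -> List[Tuple[str, float]]:
    """Return (mutation label, score change) pairs, largest gain first."""
    baseline = score(seq)
    changes = [
        (f"{wt}{pos}{aa}", score(mutant) - baseline)
        for pos, wt, aa, mutant in single_point_mutants(seq)
    ]
    return sorted(changes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Print the five substitutions with the largest toy-score gain.
    for label, delta in rank_mutations("MKVLAAGIW")[:5]:
        print(f"{label}\t{delta:+.3f}")
```

A sequence of length N yields 19 × N mutants, so the enumeration stays cheap even for long proteins; the cost of a real analysis would be dominated by the scoring function.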
