Function2Form Bridge—Toward synthetic protein holistic performance prediction

Protein engineering and synthetic biology stand to benefit immensely from recent advances in in silico tools for the structural and functional analysis of proteins. When designing novel proteins, current in silico tools each report on an individual parameter of a query protein, with output scores and metrics unique to that parameter. In reality, proteins feature multiple "parts" or functions, and modifying a protein to alter one part typically has a collateral impact on the others. No system currently exists for predicting the combined effect of design parameters on the overall performance of the final protein. Function2Form Bridge (F2F-Bridge) attempts to address this by combining the scores of the different design parameters of the protein being analyzed into a single, easily interpreted output describing overall performance. The strategy comprises (a) a mathematical method that combines data from multiple in silico tools into an OP-score (a single score describing a user-defined overall performance) and (b) the F2F Plot, a graphical means of informing the wetlab biologist holistically about the suitability of a designed construct across multiple parameters, highlighting scope for improvement. F2F predictive output was compared with wetlab data from a range of synthetic proteins designed, built, and tested for this study. Statistical and machine learning approaches for predicting overall performance, for use alongside the F2F Plot, were also examined. Comparisons between wetlab performance and F2F predictions demonstrated close and reliable correlations. This user-friendly strategy is intended to make synthetic protein building and de novo protein design more accessible.
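The abstract does not give the formula behind the OP-score, so the following is only a minimal sketch of one plausible way to combine heterogeneous tool outputs into a single number: each raw score is rescaled to [0, 1] against a user-supplied worst and best value, and the rescaled scores are averaged with user-defined weights. The function names, parameter names, ranges, and weights are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def normalize(value: float, worst: float, best: float) -> float:
    """Rescale a raw tool score to [0, 1], where 1 is the most desirable value.

    `worst` and `best` are user-supplied reference values (an assumption of
    this sketch). Passing worst > best handles "lower is better" metrics.
    """
    if best == worst:
        raise ValueError("worst and best must differ")
    scaled = (value - worst) / (best - worst)
    return min(1.0, max(0.0, scaled))  # clamp values outside the range


def op_score(
    raw: Dict[str, float],
    ranges: Dict[str, Tuple[float, float]],
    weights: Dict[str, float],
) -> float:
    """Weighted mean of normalized scores (illustrative, not the paper's formula)."""
    total_weight = sum(weights[name] for name in raw)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        weights[name] * normalize(raw[name], *ranges[name]) for name in raw
    )
    return weighted / total_weight


# Hypothetical scores for one designed construct.
raw_scores = {"stability": 0.5, "solubility": 0.8, "binding_energy": -7.0}
# (worst, best) per parameter; binding energy is "lower is better".
score_ranges = {
    "stability": (0.0, 1.0),
    "solubility": (0.0, 1.0),
    "binding_energy": (-4.0, -10.0),
}
# User-defined importance of each parameter.
score_weights = {"stability": 1.0, "solubility": 1.0, "binding_energy": 2.0}

overall = op_score(raw_scores, score_ranges, score_weights)
```

The weights stand in for the "user-defined" notion of overall performance: raising one weight makes that parameter dominate the combined score. The per-parameter normalized values are also the kind of quantity that could be shown side by side in a plot to reveal which parameter limits a design.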
