Protein preliminaries and structure prediction fundamentals for computer scientists

Protein structure prediction remains a challenging, unsolved problem in computer science. Proteins are sequences of amino acids joined by peptide bonds, and all proteins are built from combinations of the twenty primary amino acids. The in vitro laboratory methods used to determine protein structure are time-consuming, costly, and failure-prone, so computational methods offer an alternative. The protein structure prediction problem is to find the three-dimensional native structure of a protein from its amino acid sequence. The native structure has the minimum possible free energy and arguably determines the function of the protein. In this study, we present the preliminaries of proteins and their structures, the protein structure prediction problem, and protein models. We also give a brief overview of the experimental and computational methods used in protein structure prediction. This study provides fundamental knowledge for computer scientists who intend to pursue research on the protein structure prediction problem.
