Learning structural bioinformatics and evolution with a snake puzzle

We propose here a working unit for teaching basic concepts of structural bioinformatics and evolution through the example of a wooden snake puzzle, strikingly similar to toy models widely used in the literature of protein folding. In our experience, developed at a Master’s course at the Universidad Autónoma de Madrid (Spain), the concreteness of this example helps to overcome difficulties caused by the interdisciplinary nature of this field and its high level of abstraction, in particular for students coming from traditional disciplines. The puzzle will allow us to discuss a simple algorithm for finding folded solutions, through which we will introduce the concept of the configuration space and the contact matrix representation. The contact matrix is a central tool for comparing protein structures, for studying simple models of protein energetics, and even for a qualitative discussion of folding kinetics, through the concept of the Contact Order. It also allows a simple representation of misfolded conformations and their free energy. These concepts will motivate evolutionary questions, which we will address by simulating a structurally constrained model of protein evolution, again modelled on the snake puzzle. In this way, we can discuss the analogy between evolutionary concepts and statistical mechanics, which facilitates the understanding of both. The proposed examples and literature are accessible, and we provide supplementary material (see ‘Data Availability’) to reproduce the numerical experiments. We also suggest possible directions to expand the unit. We hope that this work will further stimulate the adoption of games in teaching practice.

Subjects: Bioinformatics, Computational Biology, Computer Education, Scientific Computing and Simulation
