PyRosetta Jupyter Notebooks Teach Biomolecular Structure Prediction and Design

Biomolecular structure drives function, and computational capabilities have progressed to the point that the prediction and computational design of biomolecular structures are increasingly feasible. Because computational biophysics attracts students from many different backgrounds and with different levels of resources, teaching the subject can be challenging. One strategy for teaching diverse learners is to use interactive multimedia material that promotes self-paced, active learning. We have created a hands-on education strategy built on a set of fifteen modules that teach topics in biomolecular structure and design, from fundamentals of conformational sampling and energy evaluation to applications such as protein docking, antibody design, and RNA structure prediction. Our modules are based on PyRosetta, a Python library that encapsulates all computational modules and methods in the Rosetta software package. The workshop-style modules are implemented as Jupyter Notebooks that can be executed in Google Colaboratory, giving learners access with just a web browser. The digital format of Jupyter Notebooks allows us to embed images, molecular visualization movies, and interactive coding exercises. This multimodal approach may better reach students from different disciplines and experience levels, and may attract more researchers from smaller labs and cognate backgrounds to leverage PyRosetta in their science and engineering research. All materials are freely available at https://github.com/RosettaCommons/PyRosetta.notebooks.
