Exploring the repeat protein universe through computational protein design

A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structural unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. Here we use computational protein design to investigate the space of folded structures that can be generated by tandemly repeating a simple helix–loop–helix–loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models, with root mean square deviations ranging from 0.7 to 2.5 Å. Our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
