Bridging the gap between single-template and fragment-based protein structure modeling using Spanner

Background: As the coverage of experimentally determined protein structures increases, fragment-based approaches are expected to play an ever more important role in structural modeling. Here we introduce a structural modeling method by which an initial structural template can be extended by the addition of structural fragments to more closely match an aligned query sequence. A database of protein fragments indexed by their internal coordinates was created and a novel methodology for their retrieval was implemented. After fragment selection and assembly, sidechains are replaced and the all-atom model is refined by restrained energy minimization. We implemented the proposed method in the program Spanner and benchmarked it using a previously published set of 367 immunoglobulin (Ig) loops, 206 historical query-template pairs and alignments from the Critical Assessment of protein Structure Prediction (CASP) experiment, and 217 structural alignments between remotely homologous query-template pairs. The constraint-based modeling software MODELLER and previously reported results for RosettaAntibody were used as references.
Results: The error in the modeled structures was assessed by root-mean-square deviation (RMSD) from the native structure, as a function of the query-template sequence identity. For the Ig benchmark set, for which a single fragment was used to model each loop, the average RMSD for Spanner (3 ± 1.5 Å) was found to lie midway between that of MODELLER (4 ± 2 Å) and RosettaAntibody (2 ± 1 Å). For the CASP and structural alignment benchmarks, for which gaps represent a small fraction of the modeled residues, the differences between Spanner and MODELLER were much smaller than the standard deviations of either program. The Spanner web server and source code are available at http://sysimm.ifrec.osaka-u.ac.jp/Spanner/.
Conclusions: For typical homology modeling, Spanner is at least as good, on average, as the template-free constraint-driven approach used by MODELLER. The Ig model results suggest that when gap regions represent a significant fraction of the alignment, Spanner’s efficient use of fragment libraries, along with local sequence and secondary structural information, significantly improves model accuracy without a dramatic increase in computational cost.
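
The accuracy measure used in the benchmarks, RMSD from the native structure reported as a mean and standard deviation, can be illustrated with a minimal sketch. This is not Spanner's code: the function names `rmsd` and `summarize` and the toy coordinates are hypothetical, and the sketch assumes that the model and native structures are already superimposed and that atoms are paired by index.

```python
import math
from statistics import mean, stdev


def rmsd(model, native):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumption of this sketch: the structures are already superimposed
    and atom i of the model corresponds to atom i of the native.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Sum of squared per-coordinate differences over all paired atoms
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


def summarize(values):
    """Mean and sample standard deviation of per-model RMSD values."""
    return mean(values), stdev(values)


# Hypothetical three-atom loop; the model is shifted by 1 Å along x
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(rmsd(model, native))
```

Applying `summarize` to the per-loop RMSD values of a benchmark set would give a summary of the same form as the "mean ± standard deviation" figures quoted in the Results.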
