Analytical Approaches to Improve Accuracy in Solving the Protein Topology Problem

To take advantage of recent advances in genomics and proteomics, it is critical that the three-dimensional structure of biological macromolecules be determined. Cryo-electron microscopy (cryo-EM) is a promising and steadily improving method for obtaining these data; however, its resolution is often insufficient to determine atomic-scale structure directly. Even so, the locations of secondary structure elements are detectable. De novo modeling is a computational approach to modeling these macromolecular structures from cryo-EM-derived data. During de novo modeling, a mapping between the detected secondary structures and the underlying amino acid sequence must be identified. DP-TOSS (Dynamic Programming for determining the Topology Of Secondary Structures) is one tool that attempts to automate the creation of this mapping. By treating the correspondence between the detected structures and the structures predicted from sequence data as a constraint graph problem, DP-TOSS achieved good accuracy in its original iteration. In this paper, we propose modifications to the scoring methodology of DP-TOSS to improve its accuracy. Three scoring schemes were applied to DP-TOSS and tested: (i) a skeleton-based scoring function; (ii) a geometry-based analytical function; and (iii) a multi-well potential energy-based function. A test on 25 proteins shows that a combination of these schemes can improve the performance of DP-TOSS on the topology determination problem for proteins.
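To make the mapping problem concrete, the following is a minimal sketch of topology determination by exhaustive enumeration. It is not the DP-TOSS algorithm and uses none of the three scoring schemes above: the function name `best_topology`, the toy input values, and the length-only cost are assumptions made for illustration. Each helix predicted from sequence is assigned to one detected helix stick, and an assignment is scored by the mismatch between the expected helix length (about 1.5 Å of rise per residue) and the stick length. Stick direction, which the full problem must also resolve, is omitted.

```python
from itertools import permutations

# Approximate rise of an alpha helix along its axis, in angstroms per residue.
RISE_PER_RESIDUE = 1.5


def best_topology(seq_helix_lengths, stick_lengths):
    """Brute-force the lowest-cost assignment of sequence helices to sticks.

    seq_helix_lengths: helix lengths in residues, in sequence order.
    stick_lengths: lengths (angstroms) of helix sticks detected in the map.
    Returns (cost, assignment), where assignment[i] is the index of the
    stick assigned to sequence helix i.
    """
    best = None
    n_sticks = len(stick_lengths)
    # Every ordered choice of distinct sticks is one candidate topology.
    for perm in permutations(range(n_sticks), len(seq_helix_lengths)):
        cost = sum(
            abs(n_res * RISE_PER_RESIDUE - stick_lengths[j])
            for n_res, j in zip(seq_helix_lengths, perm)
        )
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best


# Hypothetical toy input: three predicted helices and three detected sticks.
seq = [10, 20, 14]            # residues
sticks = [30.0, 21.0, 15.0]   # angstroms
cost, assignment = best_topology(seq, sticks)
print(cost, assignment)
```

The number of candidate assignments grows factorially with the number of secondary structure elements, which is why tools in this area rely on graph formulations and dynamic programming rather than enumeration.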
