Building the initial chain of the proteins through de novo modeling of the cryo-electron microscopy volume data at the medium resolutions

Cryo-electron microscopy (cryoEM) is an advanced imaging technique that produces volume maps at different resolutions. It is capable of visualizing large molecular complexes such as viruses and ribosomes. At medium resolutions, such as 5 to 10 Å, the location and orientation of the secondary structure elements (SSEs) can be identified computationally. However, there is no registration between the detected SSEs and the protein sequence, so it is challenging to derive the atomic structure from such volume data. In this paper, we present preliminary results on building full-atom protein chains with our de novo modeling framework. The framework has multiple components, including the ranking of topologies, the construction of helices and loops along the density traces, and the energy evaluation of the structure. A test containing thirteen simulated density maps and two experimentally derived density maps shows that the true topology was ranked among the top 35 topologies of the very large topological space. The best atomic model of the true topology was ranked within the top 40 models for twelve of the fifteen proteins tested. The average backbone RMSD100 of these models is about 4 Å for the fifteen proteins.
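The abstract reports accuracy as RMSD100, a length-normalized RMSD. As a minimal sketch, assuming the commonly used normalization RMSD100 = RMSD / (1 + ln sqrt(N/100)), where N is the number of residues compared, the conversion can be written as follows. The function name `rmsd100` and its arguments are illustrative and are not taken from the paper.

```python
import math


def rmsd100(rmsd: float, n_residues: int) -> float:
    """Normalize an RMSD (in angstroms) to a reference length of 100 residues.

    Assumes the normalization RMSD100 = RMSD / (1 + ln(sqrt(N / 100))).
    The function name and signature are hypothetical, chosen for illustration.
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    denom = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    # For very short chains (roughly N < 14) the denominator is not positive
    # and the normalization is not meaningful.
    if denom <= 0.0:
        raise ValueError("chain too short for RMSD100 normalization")
    return rmsd / denom


if __name__ == "__main__":
    # A 100-residue protein is unchanged; longer chains are scaled down.
    print(rmsd100(4.0, 100))
    print(rmsd100(4.0, 400))
```

Under this assumed formula, a raw RMSD is left unchanged for a 100-residue chain, reduced for longer chains, and increased for shorter ones, which makes values comparable across proteins of different sizes such as the fifteen tested here.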
