Estimating loop length from CryoEM images at medium resolutions

Background
De novo protein modeling approaches utilize 3-dimensional (3D) images derived from electron cryomicroscopy (CryoEM) experiments. In the 3D image, a loop is represented by the skeleton connecting two secondary structures such as α-helices. The accuracy of the skeleton and of the detected secondary structures is critical in de novo modeling. The length along the skeleton must be measured accurately because it can be used as a constraint in modeling the protein.

Results
We have developed a novel computational geometric approach that derives a simplified curve in order to estimate the loop length along the skeleton. The method was tested using fifty simulated density images of helix-loop-helix segments of atomic structures and eighteen experimentally derived density maps from the Electron Microscopy Data Bank (EMDB). For the simulated density maps, the estimated length was within 0.5 Å of the expected length in 48 of the 50 cases. For the eighteen experimentally derived CryoEM images, twelve cases had an error within 2 Å.

Conclusions
The tests using both simulated and experimentally derived images show that the proposed method can estimate the loop length along the skeleton, provided that the secondary structure elements, such as α-helices, are detected accurately and that a continuous skeleton links the α-helices.
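To illustrate why a simplified curve helps, the sketch below measures a voxel-stepped skeleton path before and after simplification. It is not the authors' method, which the abstract does not specify in detail. It substitutes the generic Ramer-Douglas-Peucker line-simplification technique, and the function names, the tolerance value, and the sample coordinates (assumed to be in Å) are all hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment from a to b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        return dist(p, a)
    # Clamp the projection parameter so the closest point stays on the segment.
    t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return dist(p, closest)


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an ordered 3D polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Every interior point is close to the chord, so keep only the ends.
        return [first, last]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


def polyline_length(points):
    """Sum of the segment lengths along an ordered polyline."""
    return sum(dist(p, q) for p, q in zip(points, points[1:]))


# Hypothetical skeleton path: a voxel "staircase" along a diagonal loop.
staircase = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0),
             (2, 2, 0), (3, 2, 0), (3, 3, 0)]
raw_length = polyline_length(staircase)
simplified_length = polyline_length(simplify(staircase, tolerance=1.0))
print(raw_length, simplified_length)
```

On this toy path the raw voxel-by-voxel length is 6 Å, while the simplified curve measures about 4.24 Å, the length of the underlying diagonal. The tolerance controls how much of the stepping is treated as discretization noise rather than as genuine curvature of the loop, so it would need to be chosen with the voxel size in mind.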
