Improved Efficiency in Cryo-EM Secondary Structure Topology Determination from Inaccurate Data

The determination of the secondary structure topology is a critical step in deriving the atomic structure from the protein density map obtained from electron cryo-microscopy technique. This step often relies on the matching of two sources of information. One source comes from the secondary structures detected from the protein density map at the medium resolution, such as 5-10 Å. The other source comes from the predicted secondary structures from the amino acid sequence. Due to the inaccuracy in either source of information, a pool of possible secondary structure positions needs to be sampled. This paper studies the question, that is, how to reduce the computation of the mapping when the inaccuracy of the secondary structure predictions is considered. We present a method that combines the concept of dynamic graph with our previous work of using constrained shortest path to identify the topology of the secondary structures. We show a reduction of 34.55% of run-time as comparison to the naïve way of handling the inaccuracies. We also show an improved accuracy when the potential secondary structure errors are explicitly sampled verses the use of one consensus prediction. Our framework demonstrated the potential of developing computationally effective exact algorithms to identify the optimal topology of the secondary structures when the inaccuracy of the predicted data is considered.

[1]  Z. Zhou,et al.  Towards atomic resolution structural determination by single-particle cryo-electron microscopy. , 2008, Current opinion in structural biology.

[2]  Jianpeng Ma,et al.  A structural-informatics approach for mining beta-sheets: locating sheets in intermediate-resolution density maps. , 2003, Journal of molecular biology.

[3]  Geoffrey J. Barton,et al.  JPred : a consensus secondary structure prediction server , 1999 .

[4]  P. Stewart,et al.  EM-fold: De novo folding of alpha-helical proteins guided by intermediate-resolution electron microscopy density maps. , 2009, Structure.

[5]  Liam J. McGuffin,et al.  The PSIPRED protein structure prediction server , 2000, Bioinform..

[6]  Kamal Al-Nasr,et al.  Structure prediction for the helical skeletons detected from the low resolution protein density map , 2010, BMC Bioinformatics.

[7]  B. Rost,et al.  Alignments grow, secondary structure prediction improves , 2002, Proteins.

[8]  Z. Zhou,et al.  Atomic resolution cryo electron microscopy of macromolecular complexes. , 2011, Advances in protein chemistry and structural biology.

[9]  W Chiu,et al.  EMAN: semiautomated software for high-resolution single-particle reconstructions. , 1999, Journal of structural biology.

[10]  Aleksey A. Porollo,et al.  Combining prediction of secondary structure and solvent accessibility in proteins , 2005, Proteins.

[11]  Desh Ranjan,et al.  Ranking Valid Topologies of the Secondary Structure Elements Using a Constraint Graph , 2011, J. Bioinform. Comput. Biol..

[12]  P. Argos,et al.  Seventy‐five percent accuracy in protein secondary structure prediction , 1997, Proteins.

[13]  Jing He,et al.  IDENTIFICATION OF α-HELICES FROM LOW RESOLUTION PROTEIN DENSITY MAPS , 2006 .

[14]  Liam J. McGuffin,et al.  Protein structure prediction servers at University College London , 2005, Nucleic Acids Res..

[15]  Aoife McLysaght,et al.  Porter: a new, accurate server for protein secondary structure prediction , 2005, Bioinform..

[16]  Yonggang Lu,et al.  Deriving Topology and Sequence Alignment for the Helix Skeleton in Low-Resolution protein Density Maps , 2008, J. Bioinform. Comput. Biol..

[17]  Jing He,et al.  Native secondary structure topology has near minimum contact energy among all possible geometrically constrained topologies , 2009, Proteins.

[18]  M. Baker,et al.  Identification of secondary structure elements in intermediate-resolution density maps. , 2007, Structure.

[19]  W. Chiu Electron microscopy of frozen, hydrated biological specimens. , 1986, Annual review of biophysics and biophysical chemistry.

[20]  Jaap Heringa,et al.  The influence of gapped positions in multiple sequence alignments on secondary structure prediction methods , 2004, Comput. Biol. Chem..

[21]  Thomas W. Reps,et al.  An Incremental Algorithm for a Generalization of the Shortest-Path Problem , 1996, J. Algorithms.

[22]  Pierre Baldi,et al.  Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles , 2002, Proteins.

[23]  M. Baker,et al.  Modeling protein structure at near atomic resolutions with Gorgon. , 2011, Journal of structural biology.

[24]  M. Baker,et al.  Bridging the information gap: computational tools for intermediate resolution structure interpretation. , 2001, Journal of molecular biology.