A Constraint Dynamic Graph Approach to Identify the Secondary Structure Topology from cryoEM Density Data in Presence of Errors

Determining the secondary structure topology is a critical step in deriving an atomic structure from a protein density map obtained by electron cryo-microscopy (cryoEM). This step often relies on matching two sources of information. One is the set of secondary structures detected in the density map at medium resolution, such as 5-10 Å. The other is the set of secondary structures predicted from the amino acid sequence. Because either source may be uncertain, a pool of possible secondary structure positions has to be sampled in order to include the true answer. A naïve way to find the native topology is to exhaustively map the pool of possible secondary structures detected in the density map to the pool of secondary structures predicted from the sequence and search for the topology with the lowest cost. This paper studies how to reduce the computation of this mapping when the uncertainty of the secondary structure predictions is considered. We present a method that combines the concept of a dynamic graph with our previous work on using a constrained shortest path to identify the topology of the secondary structures. We show a reduction of about 34.55% in running time compared with the naïve way of handling the inaccuracies. To our knowledge, this is the first computationally effective exact algorithm to identify the optimal topology of the secondary structures when the inaccuracy of the predicted data is considered.
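To make the cost of the naïve baseline concrete, the following is a minimal sketch of an exhaustive search over a pool of alternative predictions. It is not the method of this paper: the names `topology_count`, `toy_cost`, and `naive_search` are hypothetical, and the cost function (total length mismatch between sequence helices and density sticks) is an assumption chosen only for illustration.

```python
from itertools import permutations, product
from math import factorial


def topology_count(n_helices):
    """Number of topologies for n helices mapped to n sticks:
    n! orderings times two possible directions per stick."""
    return factorial(n_helices) * 2 ** n_helices


def toy_cost(seq_lengths, stick_lengths, order):
    """Hypothetical cost: total absolute length mismatch when sequence
    helix i is mapped to density stick order[i]."""
    return sum(abs(seq_lengths[i] - stick_lengths[s])
               for i, s in enumerate(order))


def naive_search(seq_pools, stick_lengths):
    """Exhaustive baseline: try every combination of alternative
    predictions, and for each one every assignment of helices to sticks.
    Direction is ignored because the toy cost does not depend on it."""
    best = (float("inf"), None, None)
    for choice in product(*seq_pools):  # one alternative per predicted helix
        for order in permutations(range(len(stick_lengths)), len(choice)):
            cost = toy_cost(choice, stick_lengths, order)
            if cost < best[0]:
                best = (cost, choice, order)
    return best


# Three predicted helices, two of them with alternative lengths,
# matched against three sticks detected in the density map.
pools = [[10, 12], [20], [15, 16]]
sticks = [16, 12, 20]
result = naive_search(pools, sticks)
```

In this sketch the search is restarted from scratch for every combination of alternatives, so the work grows with the product of the pool sizes times the number of topologies. That repeated work is what an incremental, dynamic-graph formulation aims to avoid.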
