Computational development for secondary structure detection from three-dimensional images of cryo-electron microscopy

Dong Si, Old Dominion University, 2015. Director: Dr. Jing He

Electron cryo-microscopy (cryo-EM), a cutting-edge technology, has carved a niche for itself in the study of large-scale protein complexes. Although the protein backbone of a complex cannot be derived directly from three-dimensional (3D) density images at medium resolution (5-10 Å), secondary structure elements (SSEs) such as α-helices and β-sheets can still be detected. The accuracy of SSE detection from volumetric protein density images is critical for ab initio backbone structure derivation in cryo-EM. So far, detecting SSEs automatically and accurately from density images at these resolutions has remained challenging. This dissertation presents four computational methods, SSEtracer, SSElearner, StrandTwister, and StrandRoller, for solving this critical problem.

An effective approach, SSEtracer, is presented to automatically identify helices and β-sheets from cryo-EM 3D maps at medium resolutions. A simple mathematical model is introduced to represent the β-sheet density; the model can be used for β-strand detection from medium-resolution density maps. A machine learning approach, SSElearner, has also been developed to automatically identify helices and β-sheets using knowledge from existing volumetric maps in the Electron Microscopy Data Bank (EMDB). The approach has been tested using simulated density maps and experimental cryo-EM maps from the EMDB. The results of SSElearner suggest that it is effective to use one cryo-EM map for learning in order to detect the SSEs in another cryo-EM map of similar quality.

Major secondary structure elements such as α-helices and β-sheets can be computationally detected from cryo-EM density maps with medium resolutions of 5-10 Å.
However, a critical piece of information for modeling atomic structures is missing, since there are no tools to detect β-strands from cryo-EM maps at medium resolutions. A new method, StrandTwister, has been proposed to detect the traces of β-strands through the analysis of twist, an intrinsic property of the β-sheet. StrandTwister has been tested using 100 β-sheets simulated at 10 Å resolution and 39 β-sheets computationally detected from cryo-EM density maps at 4.4-7.4 Å resolutions. StrandTwister appears to detect the traces of β-strands on major β-sheets quite accurately, particularly in the central area of a β-sheet.

A β-barrel is a structural feature formed by multiple β-strands arranged in a barrel shape. There is no existing method to derive the β-strands from the 3D image of a β-barrel. A new method, StrandRoller, has been proposed to generate small sets of possible β-traces from density images at medium resolutions of 5-10 Å. The results of StrandRoller suggest that it is possible to derive a small set of possible β-traces from the cryo-EM image of a β-barrel at medium resolutions even when it is not possible to visualize the separation of β-strands.

Copyright, 2015, by Dong Si, All Rights Reserved.
