Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps

Cryo-electron microscopy (cryo-EM) has become a leading technology for determining protein structures. Recent advances in this field have allowed for atomic resolution. However, predicting the backbone trace of a protein has remained a challenge on all but the most pristine density maps (< 2.5Å resolution). Here we introduce a deep learning model that uses a set of cascaded convolutional neural networks (CNNs) to predict Cα atoms along a protein’s backbone structure. The cascaded-CNN (C-CNN) is a novel deep learning architecture composed of multiple CNNs, each predicting a specific aspect of a protein’s structure. The model predicts secondary structure elements (SSEs), backbone structure, and Cα atoms, and combines the results of each to produce a complete prediction map. The C-CNN is a semantic segmentation image classifier and was trained on thousands of simulated density maps. The method is largely automatic and requires only a recommended threshold value for each evaluated protein. A specialized tabu-search path-walking algorithm was used to produce an initial backbone trace with Cα placements. A helix-refinement algorithm then improved the α-helix SSEs of the backbone trace. Finally, a novel quality assessment-based combinatorial algorithm was used to map Cα traces to full-atom protein structures. The method was tested on 50 experimental maps between 2.6Å and 4.4Å resolution. It outperformed several state-of-the-art prediction methods, including RosettaES, MAINMAST, and a Phenix-based method, by producing the most complete prediction models, as measured by the percentage of Cα atoms found. On average, the method predicted 88.5% of the Cα atoms within 3Å of a protein’s backbone structure, surpassing the 66.8% achieved by the leading alternative method (the Phenix-based fully automatic method) on the same set of density maps. The C-CNN also achieved an average RMSD of 1.23Å across all 50 experimental density maps, which is similar to that of the Phenix-based fully automatic method. The model and all code can be downloaded at https://github.com/DrDongSi/Ca-Backbone-Prediction.
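
The two evaluation figures quoted above (percentage of Cα atoms found within 3Å, and RMSD) can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes one plausible reading of the metric: a native Cα counts as found when its nearest predicted Cα lies within the cutoff, and RMSD is taken over those matched pairs only. The function name `ca_match_metrics` and the toy coordinates are hypothetical.

```python
import math

def ca_match_metrics(native, predicted, cutoff=3.0):
    """Return (percent_found, rmsd) for two lists of (x, y, z) Ca coordinates in Angstrom.

    Assumption (not taken from the paper): each native Ca is paired with its
    nearest predicted Ca, and the pair counts as a match if the distance is
    at most `cutoff`. RMSD is computed over matched pairs only.
    """
    matched_sq = []
    for n in native:
        # Distance from this native atom to the closest predicted atom.
        best = min((math.dist(n, p) for p in predicted), default=math.inf)
        if best <= cutoff:
            matched_sq.append(best ** 2)
    percent = 100.0 * len(matched_sq) / len(native) if native else 0.0
    rmsd = math.sqrt(sum(matched_sq) / len(matched_sq)) if matched_sq else float("nan")
    return percent, rmsd

# Toy example: four native Ca atoms spaced 3.8 Angstrom apart, three predictions.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (3.8, 1.0, 0.0), (7.6, 0.0, 5.0)]
percent, rmsd = ca_match_metrics(native, predicted)
print(f"found {percent:.1f}% of Ca atoms, RMSD {rmsd:.2f} A")
```

Nearest-neighbor pairing lets two native atoms share one predicted atom. A stricter one-to-one assignment would give equal or lower percentages, so the pairing rule should be stated whenever such numbers are compared across methods.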
