Building Proteins in a Day: Efficient 3D Molecular Structure Estimation with Electron Cryomicroscopy

Discovering the 3D atomic-resolution structure of molecules such as proteins and viruses is one of the foremost research problems in biology and medicine. Electron Cryomicroscopy (cryo-EM) is a promising vision-based technique for structure estimation which attempts to reconstruct 3D atomic structures from a large set of 2D transmission electron microscope images. This paper presents a new Bayesian framework for cryo-EM structure estimation that builds on modern stochastic optimization techniques to allow one to scale to very large datasets. We also introduce a novel Monte-Carlo technique that reduces the cost of evaluating the objective function during optimization by over five orders of magnitude. The net result is an approach capable of estimating 3D molecular structure from large-scale datasets in about a day on a single CPU workstation.
