Maxent-Stress Optimization of 3D Biomolecular Models

Knowing a biomolecule's structure is inherently linked to, and a prerequisite for, any detailed understanding of its function. Significant effort has therefore gone into developing technologies for structural characterization. These technologies do not directly provide 3D structures; instead, they typically yield noisy and error-prone distance information between specific entities such as atoms or residues, which has to be translated into consistent 3D models. Here we present an approach to this translation process based on maxent-stress optimization. Our approach extends the original graph drawing method to the specifics of the new application by introducing additional constraints and confidence values as well as new algorithmic components. Extensive experiments demonstrate that our approach infers structural models (i.e., sensible 3D coordinates for the molecule's atoms) that correspond well to the distance information, that it can handle noisy and error-prone data, and that it is considerably faster than established tools. Our results promise to allow domain scientists nearly interactive structural modeling based on distance constraints.
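To make the idea of translating distance information into 3D coordinates concrete, the following is a minimal sketch of the general maxent-stress objective from graph layout: a weighted stress term over pairs with known target distances, minus a small entropy term that pushes the remaining pairs apart. It is not the method described here. The additional constraints and algorithmic components mentioned above are not specified in this text, the weight matrix `W` is only assumed to play the role of confidence values, and plain gradient descent is substituted for whatever solver the authors use. The names `stress`, `maxent_stress_layout`, `alpha`, `lr`, and `steps` are hypothetical.

```python
import numpy as np


def stress(X, D, W):
    """Weighted stress: sum over pairs i<j of W_ij * (||x_i - x_j|| - D_ij)^2."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # W and D are symmetric, so the full sum counts every pair twice.
    return 0.5 * float(np.sum(W * (dist - D) ** 2))


def maxent_stress_layout(D, W, dim=3, alpha=0.01, steps=2000, lr=0.01, seed=0):
    """Gradient descent on stress(X) - alpha * sum_{unknown pairs} ln ||x_i - x_j||.

    D: symmetric (n, n) array of target distances (ignored where W == 0).
    W: symmetric (n, n) array of weights; W_ij > 0 marks a known pair.
       Treating W as confidence values is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.standard_normal((n, dim))      # random initial coordinates
    known = W > 0
    unknown = ~known
    np.fill_diagonal(unknown, False)       # a point is not paired with itself

    for _ in range(steps):
        diff = X[:, None, :] - X[None, :, :]          # shape (n, n, dim)
        dist = np.linalg.norm(diff, axis=2)
        np.fill_diagonal(dist, 1.0)                   # avoid 0/0 on the diagonal
        dist = np.maximum(dist, 1e-9)
        # Known pairs pull toward the target distance; unknown pairs repel weakly.
        coef = (np.where(known, 2.0 * W * (dist - D) / dist, 0.0)
                - np.where(unknown, alpha / dist ** 2, 0.0))
        grad = (coef[:, :, None] * diff).sum(axis=1)
        X -= lr * grad
    return X
```

A caller would fill `D` with the measured distances, set `W` to zero for pairs without a measurement and to a positive weight otherwise (for example `1 / D**2`, a common choice in stress models), and pass both to `maxent_stress_layout`. The returned coordinates are determined only up to rotation, translation, and reflection, since the objective depends on pairwise distances alone.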
