A Distributed SDP Approach for Large-Scale Noisy Anchor-Free Graph Realization with Applications to Molecular Conformation

We propose a distributed algorithm for solving Euclidean metric realization problems arising from large 3-D graphs, using only noisy distance information and without any prior knowledge of the positions of any of the vertices. In our distributed algorithm, the graph is first subdivided into smaller subgraphs using intelligent clustering methods. Then a semidefinite programming relaxation and gradient search method are used to localize each subgraph. Finally, a stitching algorithm is used to find affine maps between adjacent clusters, and the positions of all points in a global coordinate system are then derived. In particular, we apply our method to the problem of finding the 3-D molecular configurations of proteins based on a limited number of given pairwise distances between atoms. The protein molecules, all with known molecular configurations, are taken from the Protein Data Bank. Our algorithm is able to reconstruct reliably and efficiently the configurations of large protein molecules from a limited number of pairwise distances corrupted by noise, without incorporating domain knowledge such as the minimum separation distance constraints derived from van der Waals interactions.
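
The stitching step can be illustrated with a small sketch. The abstract does not say how the affine maps between adjacent clusters are computed, so the code below assumes an ordinary least-squares fit over the points that two overlapping clusters share; this is an illustration under that assumption, not the authors' implementation. The function names `fit_affine_map` and `apply_affine_map`, and the synthetic data, are hypothetical.

```python
import numpy as np


def fit_affine_map(src, dst):
    """Least-squares affine map (A, t) such that dst ~ src @ A.T + t.

    src, dst: (k, 3) arrays holding the same k shared points expressed in
    two different local coordinate systems. In 3-D, at least 4 shared
    points that are not coplanar are needed for a unique fit.
    (Assumption: plain least squares; the paper may use another scheme.)
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates: a trailing column of ones carries the translation.
    src_h = np.hstack([src, np.ones((src.shape[0], 1))])
    # Solve src_h @ M ~ dst in the least-squares sense; M has shape (4, 3).
    M, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    A = M[:3].T  # linear part
    t = M[3]     # translation part
    return A, t


def apply_affine_map(points, A, t):
    """Map points (n, 3) into the target coordinate system."""
    return np.asarray(points, dtype=float) @ A.T + t


# Synthetic example: 10 points, two overlapping clusters.
rng = np.random.default_rng(0)
true_points = rng.normal(size=(10, 3))

# Cluster A holds points 0..7, already in the global frame.
cluster_a = true_points[:8]

# Cluster B holds points 3..9 in its own rotated and shifted local frame.
theta = 0.7
R = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
shift = np.array([1.0, -2.0, 0.5])
cluster_b_local = true_points[3:] @ R.T + shift

# Points 3..7 appear in both clusters: rows 0..4 of B, rows 3..7 of A.
A, t = fit_affine_map(cluster_b_local[:5], cluster_a[3:8])

# Bring all of cluster B into cluster A's (global) frame.
cluster_b_global = apply_affine_map(cluster_b_local, A, t)
max_error = np.abs(cluster_b_global - true_points[3:]).max()
print("max stitching error:", max_error)
```

With noisy local solutions the fitted map is only approximate, and an unconstrained affine fit can introduce scaling or shear; restricting the map to a rigid motion is a possible alternative design, not something the abstract states.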
