Similarity recognition of molecular structures by optimal atomic matching and rotational superposition

An algorithm for similarity recognition of molecules and molecular clusters is presented which also establishes the optimum matching among atoms of different structures. In the first step of the algorithm, a set of molecules is coarsely superimposed by transforming the molecules into a common reference coordinate system. The optimum atomic matching among structures is then found using the Hungarian algorithm. For this, pairs of structures are represented as complete bipartite graphs with a weight function that uses intermolecular atomic distances. In the final step, a rotational superposition method is applied using the optimum atomic matching found. This yields the minimum root mean square deviation of intermolecular atomic distances with respect to arbitrary rotation and translation of the molecules. Combined with an effective similarity prescreening method, our algorithm shows robustness and an effective quadratic scaling of computational time with the number of atoms. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011
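The atomic matching step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two structures with the same number of atoms that are already coarsely superimposed, ignores element types, and uses the squared Euclidean distance between atoms as the edge weight of the complete bipartite graph (the abstract only states that the weight function uses intermolecular atomic distances). The names `sq_dist`, `hungarian`, and `match_atoms` are hypothetical, and the assignment solver is a textbook O(n^3) formulation of the Hungarian method. The prescreening and the final rotational superposition are not shown.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hungarian(cost):
    """Solve the square assignment problem for an n x n cost matrix.

    Returns a list `match` where match[i] is the column assigned to row i
    such that the total cost is minimal.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-based)
    v = [0.0] * (n + 1)      # column potentials (1-based)
    p = [0] * (n + 1)        # p[j]: row currently assigned to column j
    way = [0] * (n + 1)      # predecessor columns on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Update potentials so that at least one new edge becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the assignment along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match


def match_atoms(struct_a, struct_b):
    """Return (match, rmsd) for two equally sized lists of 3D coordinates.

    match[i] is the index of the atom in struct_b assigned to atom i of
    struct_a; rmsd is computed for that assignment without any further
    rotation or translation.
    """
    cost = [[sq_dist(a, b) for b in struct_b] for a in struct_a]
    match = hungarian(cost)
    total = sum(cost[i][j] for i, j in enumerate(match))
    return match, math.sqrt(total / len(struct_a))


# Toy data (made up for illustration): B is A with atoms reordered and
# each atom displaced by about 0.1 along one axis.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
B = [(0.0, 1.1, 0.0), (0.1, 0.0, 0.0), (0.0, 0.0, 0.9), (1.0, 0.1, 0.0)]

match, rmsd = match_atoms(A, B)
print(match, f"{rmsd:.3f}")
```

In a full pipeline along the lines of the abstract, the returned matching would then be passed to a rotational superposition step to minimize the root mean square deviation over rotations and translations; restricting matches to atoms of the same element could be approximated by assigning a very large cost to unlike pairs, which is again an assumption rather than something the abstract specifies.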
