ArbAlign: A Tool for Optimal Alignment of Arbitrarily Ordered Isomers Using the Kuhn-Munkres Algorithm

When assessing the similarity between two isomers whose atoms are ordered identically, one typically translates and rotates their Cartesian coordinates for best alignment and computes the pairwise root-mean-square distance (RMSD). However, if the atoms are ordered differently or the molecular axes are switched, one must first find the best ordering of the atoms and check for optimal axes before a meaningful pairwise RMSD can be calculated. Finding the best ordering by examining all permutations scales factorially, which is too expensive for any system with more than ten atoms. We report the use of the Kuhn-Munkres matching algorithm to reduce the cost of finding the best ordering from factorial to polynomial scaling. This allows the scheme to be applied efficiently to arbitrary systems. Its performance is demonstrated for a range of molecular clusters as well as rigid systems. The largely standalone tool is freely available for download and distribution under the GNU General Public License v3.0 (GNU_GPL_v3). An online implementation is also provided via a web server (http://www.arbalign.org) for convenient use.
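
The reordering step can be illustrated with a minimal sketch. It is not the ArbAlign source code: it assumes all atoms are of one element, that the two structures already share a frame (no rotation, reflection, or axis swapping is attempted), and that the cost of pairing two atoms is their squared distance. The function names (`squared_distance_matrix`, `hungarian`, `brute_force`) and the coordinates are made up for illustration. The `hungarian` function is a textbook O(n^3) form of the Kuhn-Munkres algorithm, and `brute_force` is the factorial-scaling search it replaces.

```python
import itertools
import math


def squared_distance_matrix(a, b):
    """cost[i][j] = squared distance between atom i of a and atom j of b."""
    return [[sum((p - q) ** 2 for p, q in zip(ai, bj)) for bj in b] for ai in a]


def hungarian(cost):
    """Kuhn-Munkres on a square cost matrix.

    Returns a list `order` where order[i] is the column matched to row i,
    chosen so that the summed cost is minimal.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-based)
    v = [0.0] * (n + 1)      # column potentials (1-based)
    match = [0] * (n + 1)    # match[j] = row assigned to column j, 0 = none
    way = [0] * (n + 1)      # previous column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    order = [0] * n
    for j in range(1, n + 1):
        order[match[j] - 1] = j - 1
    return order


def brute_force(cost):
    """Factorial-scaling reference: try every permutation of the columns."""
    n = len(cost)
    return min(
        itertools.permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )


# Hypothetical five-atom structure and a shuffled, slightly shifted copy.
reference = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
shuffle = [3, 0, 4, 1, 2]
shift = (0.01, -0.02, 0.005)
target = [tuple(c + s for c, s in zip(reference[k], shift)) for k in shuffle]

cost = squared_distance_matrix(reference, target)
order = hungarian(cost)
order_brute = list(brute_force(cost))
rmsd = math.sqrt(sum(cost[i][order[i]] for i in range(len(order))) / len(order))

print("Kuhn-Munkres ordering:", order)
print("Brute-force ordering: ", order_brute)
print("RMSD after reordering:", rmsd)
```

In practice one would likely call an existing solver such as `scipy.optimize.linear_sum_assignment` rather than hand-roll the matching, and would combine it with a rotational superposition step (for example the Kabsch or quaternion method) before computing the final RMSD.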
