Hash codes for the identification and classification of molecular structure elements

A set of algorithms is presented which establish the topological identity of atoms, bonds, molecules, and ensembles of molecules from a basic connection table. The computationally inexpensive result is a fixed‐length hash code which is suited for database applications and structure manipulation programs. The degree of differentiation between structural entities is adjusted easily for stereocenters, isotope labeling, atomic charges, and ionization locations or other properties. Special algorithms are presented which deal with problematic cases of uniform atomic environments. A number of practical applications demonstrate the usefulness of these hash codes. © 1994 by John Wiley & Sons, Inc.
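The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea: deriving fixed-length codes for atoms and molecules from a connection table by iterative neighborhood hashing. It is not the authors' method. The function names (`atom_codes`, `molecule_code`), the `(element, charge, isotope)` atom tuples, the `(i, j, order)` bond tuples, and the choice of a 64-bit BLAKE2b digest are all assumptions made for illustration.

```python
import hashlib


def _h(*parts):
    """Hash arbitrary printable parts to a fixed-length 64-bit integer."""
    data = repr(parts).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def atom_codes(atoms, bonds, rounds=None):
    """Return one 64-bit code per atom (hypothetical helper).

    atoms: list of (element, charge, isotope) tuples (assumed layout)
    bonds: list of (i, j, order) tuples with 0-based atom indices
    """
    n = len(atoms)
    neighbors = [[] for _ in range(n)]
    for i, j, order in bonds:
        neighbors[i].append((j, order))
        neighbors[j].append((i, order))

    # Seed each atom from its own properties only.
    codes = [_h("seed", *atom) for atom in atoms]

    # Refine: each round mixes in the sorted codes of the bonded neighbors,
    # so the result does not depend on the input atom numbering.
    for _ in range(rounds if rounds is not None else n):
        codes = [
            _h(codes[i], sorted((order, codes[j]) for j, order in neighbors[i]))
            for i in range(n)
        ]
    return codes


def molecule_code(atoms, bonds):
    """Combine the sorted atom codes into one fixed-length molecule code."""
    return _h("mol", sorted(atom_codes(atoms, bonds)))


# Heavy-atom skeletons only; hydrogens are omitted for brevity.
ethanol = ([("C", 0, 0), ("C", 0, 0), ("O", 0, 0)], [(0, 1, 1), (1, 2, 1)])
ethanol_renumbered = ([("O", 0, 0), ("C", 0, 0), ("C", 0, 0)], [(0, 1, 1), (1, 2, 1)])
dimethyl_ether = ([("C", 0, 0), ("O", 0, 0), ("C", 0, 0)], [(0, 1, 1), (1, 2, 1)])

print(molecule_code(*ethanol) == molecule_code(*ethanol_renumbered))
print(molecule_code(*ethanol) == molecule_code(*dimethyl_ether))
```

Adding or dropping fields in the atom seed (for example, a stereo descriptor) is one way such a scheme could adjust how finely structures are differentiated. Plain neighborhood refinement of this kind cannot separate atoms in some highly symmetric graphs, which is presumably the sort of uniform-environment case the abstract says requires special algorithms.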
