History and Progress of the Generation of Structural Formulae in Chemistry and its Applications (dedicated to the memory of Ivar Ugi)

After a few remarks on the history of molecular modelling, we describe certain mathematical aspects of the generation of molecular structural formulae. The focus is on the automatic generation of structural formulae for the purpose of molecular structure elucidation and the examination of molecular libraries. The aim is to give a review and to point to the relevant literature. We demonstrate an application in the area of quantitative structure-property/activity relationships. We then take a brief look at ongoing research on the generation of 3D structures (stereoisomers and conformers). Finally, we mention two problems that should be solved in the near future: the possible use of hypergraphs and the generation of patent libraries.
