Chemical Structure Elucidation from 13C NMR Chemical Shifts: Efficient Data Processing Using Bipartite Matching and Maximal Clique Algorithms

Computer-assisted chemical structure elucidation has been studied intensively since computers were first used in chemistry in the 1960s. Most existing elucidators use a structure-spectrum database to obtain clues about the correct structure. Such databases are expected to grow daily, so the need for an efficient structure elucidation system that can keep pace with that growth has also been increasing. We have therefore developed a new elucidator based on practically efficient graph algorithms: convex bipartite matching, weighted bipartite matching, and the Bron-Kerbosch maximal clique algorithm. The use of the two matching algorithms in particular is a novel feature of our elucidator. With these algorithms, the elucidator produces the correct structure whenever all of its fragments are included in the database. Even when some fragments are missing from the database, the elucidator proposes relevant substructures that can help chemists identify the actual chemical structure. The elucidator, called the CAST/CNMR Structure Elucidator, complements the CAST/CNMR Chemical Shift Predictor, and together the two can be used to analyze the structures of organic compounds.
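As a rough illustration of two of the algorithm families named above, the following Python sketch shows a greedy convex bipartite matching and a Bron-Kerbosch enumeration with pivoting. It is not the CAST/CNMR implementation. The function names, the tolerance window, the example shift values, and the reading of graph vertices as candidate fragments joined by "compatible" edges are all assumptions made for illustration.

```python
import heapq
from bisect import bisect_left, bisect_right


def shift_windows(observed, db_sorted, tol):
    """For each observed shift, the inclusive index range of database
    shifts within +/- tol (hypothetical tolerance, in ppm)."""
    return [(bisect_left(db_sorted, s - tol),
             bisect_right(db_sorted, s + tol) - 1) for s in observed]


def convex_matching(intervals, n_right):
    """Maximum matching when each left vertex is compatible with a
    contiguous range (lo, hi) of right indices (Glover's greedy rule):
    scan right vertices in order and give each one to the open interval
    that closes soonest."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    heap, match, k = [], {}, 0
    for j in range(n_right):
        # Open every interval that starts at or before j.
        while k < len(order) and intervals[order[k]][0] <= j:
            i = order[k]
            heapq.heappush(heap, (intervals[i][1], i))
            k += 1
        # Discard intervals that ended before j; they stay unmatched.
        while heap and heap[0][0] < j:
            heapq.heappop(heap)
        if heap:
            _, i = heapq.heappop(heap)
            match[i] = j
    return match


def bron_kerbosch(adj):
    """All maximal cliques of an undirected graph given as
    {vertex: set_of_neighbours}, using pivoting."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbours still in p.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


if __name__ == "__main__":
    db = [14.1, 22.7, 31.9, 128.5, 170.2]   # made-up database shifts
    obs = [22.5, 14.0, 170.0]               # made-up observed shifts
    print(convex_matching(shift_windows(obs, db, 0.5), len(db)))
    # Vertices are hypothetical fragments; edges mean "compatible".
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(bron_kerbosch(graph))
```

The convex case applies when the compatible database entries for each observed shift form a contiguous run in sorted order, which lets a heap-based greedy scan replace a general augmenting-path matching. The weighted bipartite matching mentioned in the abstract is not sketched here.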
