Combined Graph/Relational Database Management System for Calculated Chemical Reaction Pathway Data

Quantum chemical calculations are now widely used to generate extensive data sets for machine learning applications; however, these sets generally include information only on equilibrium structures and some close conformers. Exploration of potential energy surfaces provides important information on ground and transition states, but analysis of such data is complicated by the large number of possible reaction pathways. Here, we present RePathDB, a database system for managing 3D structural data for both ground and transition states resulting from quantum chemical calculations. The tool allows one to store, assemble, and analyze reaction pathway data. It combines the relational database CGR DB, which handles compounds and reactions as molecular graphs, with a graph database architecture for pathway analysis by graph algorithms. The original condensed graph of reaction (CGR) technology is used to store any chemical reaction as a single graph.
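As an illustration of the kind of pathway analysis a graph representation makes possible, the following minimal sketch treats ground states as nodes and transition states as edges, then searches for the pathway whose highest transition state is lowest. It is not RePathDB's API: the state labels, the energies, and the function names (`build_adjacency`, `lowest_barrier_path`) are hypothetical and chosen only for this example.

```python
import heapq

# Hypothetical relative energies (kcal/mol) of ground states (graph nodes).
ground_states = {"A": 0.0, "B": 5.0, "C": -3.0, "D": -10.0}

# Hypothetical transition states (graph edges): (state_1, state_2) -> TS energy.
transition_states = {
    ("A", "B"): 20.0,
    ("B", "D"): 15.0,
    ("A", "C"): 30.0,
    ("C", "D"): 12.0,
}


def build_adjacency(edges):
    """Build an undirected adjacency list from the transition-state table."""
    adjacency = {}
    for (u, v), ts_energy in edges.items():
        adjacency.setdefault(u, []).append((v, ts_energy))
        adjacency.setdefault(v, []).append((u, ts_energy))
    return adjacency


def lowest_barrier_path(edges, start, goal):
    """Return (highest TS energy, path) for the pathway whose highest
    transition state is as low as possible, or None if no pathway exists.

    This is a Dijkstra-style search in which the cost of a path is the
    maximum edge energy along it rather than the sum.
    """
    adjacency = build_adjacency(edges)
    best = {start: float("-inf")}
    queue = [(float("-inf"), start, [start])]
    while queue:
        peak, node, path = heapq.heappop(queue)
        if node == goal:
            return peak, path
        if peak > best.get(node, float("inf")):
            continue  # a better route to this node was already processed
        for neighbor, ts_energy in adjacency.get(node, []):
            new_peak = max(peak, ts_energy)
            if new_peak < best.get(neighbor, float("inf")):
                best[neighbor] = new_peak
                heapq.heappush(queue, (new_peak, neighbor, path + [neighbor]))
    return None


peak, path = lowest_barrier_path(transition_states, "A", "D")
# Effective barrier measured from the starting ground state.
barrier = peak - ground_states["A"]
print(path, barrier)
```

In a graph database the same question would typically be posed as a path query over stored nodes and relationships; the in-memory version here only shows the underlying idea.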
