Experimental and Chemoinformatics Study of Tautomerism in a Database of Commercially Available Screening Samples

We investigated how often the same chemical is sold as different products (possibly at different prices) in a prototypical large aggregated database, and at the same time tested the tautomerism definitions in the chemoinformatics toolkit CACTVS. We applied the standard CACTVS tautomeric transforms plus a set of recently developed ring-chain transforms to the Aldrich Market Select (AMS) database of 6 million screening samples and building blocks. In 30 000 cases, two or more AMS products were found to be just different tautomeric forms of the same compound. We purchased 166 such tautomer pairs and triplets and analyzed them by 1H and 13C NMR to determine whether the CACTVS transforms accurately predicted which products are the same "stuff in the bottle". Essentially all prototropic transforms with examples in the AMS were confirmed. Some of the ring-chain transforms were found to be too "aggressive", i.e., they equated structures that were in fact different compounds.
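The duplicate search described above can be pictured as grouping catalog entries by a tautomer-invariant key and keeping the groups with two or more products. The sketch below is only an illustration under stated assumptions: it does not use CACTVS or its transforms, the function `find_tautomer_duplicates`, the product IDs, and the lookup table `TOY_KEYS` are all hypothetical, and the key function stands in for whatever canonical-tautomer computation a real toolkit would supply. The one chemical fact it relies on is that 2-hydroxypyridine and 2-pyridone are tautomers of each other.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def find_tautomer_duplicates(
    products: Iterable[Tuple[str, str]],
    tautomer_key: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group (product_id, structure) records by a tautomer-invariant key.

    Returns only the groups that contain two or more product IDs, i.e.
    candidate cases of one compound listed as several products.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for product_id, structure in products:
        groups[tautomer_key(structure)].append(product_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}


# Hypothetical stand-in for a real canonical-tautomer computation:
# a hand-written table mapping SMILES strings to a shared group label.
TOY_KEYS = {
    "Oc1ccccn1": "2-pyridone group",      # 2-hydroxypyridine
    "O=c1cccc[nH]1": "2-pyridone group",  # 2-pyridone
    "c1ccccc1": "benzene",
}

# Hypothetical catalog entries (IDs are invented for illustration).
catalog = [
    ("P-001", "Oc1ccccn1"),
    ("P-002", "O=c1cccc[nH]1"),
    ("P-003", "c1ccccc1"),
]

duplicates = find_tautomer_duplicates(catalog, TOY_KEYS.__getitem__)
print(duplicates)
```

In this toy catalog the two pyridone entries fall into one group and benzene is dropped because it has no partner. How reliable such a grouping is depends entirely on the key function, which is the point the NMR validation in the study addresses: an overly "aggressive" key would merge products that are in fact different compounds.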
