Automatized Assessment of Protective Group Reactivity: A Step Toward Big Reaction Data Analysis

We report a new method for assessing the reactivity of protective groups (PGs) as a function of reaction conditions (catalyst, solvent) using raw reaction data. It is based on an intuitive similarity principle for chemical reactions: similar reactions proceed under similar conditions. Technically, reaction similarity can be assessed with the Condensed Graph of Reaction (CGR) approach, which represents an ensemble of reactants and products as a single molecular graph, i.e., as a pseudomolecule for which molecular descriptors or fingerprints can be calculated. In-house CGR-based tools were used to process data for 142,111 catalytic hydrogenation reactions extracted from the Reaxys database. Our results reveal some contradictions with the well-known Greene's Reactivity Charts, which are based on manual expert analysis. Models developed in this study show high accuracy (ca. 90%) in predicting optimal experimental conditions for protective group deprotection.
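The similarity principle can be illustrated with a minimal sketch. It assumes each reaction has already been encoded as a set of fragment labels derived from its CGR, and it uses the Tanimoto coefficient with a nearest-neighbour lookup as one plausible similarity measure. The fragment labels, the condition names, and the functions `tanimoto` and `predict_conditions` are hypothetical and chosen for illustration only; this is not the authors' in-house tooling or their actual model.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two sets of fragment labels."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical toy data: (CGR-derived fragment set, reported conditions).
# Labels such as "C-O:broken" stand in for dynamic bonds of a CGR;
# the conditions are placeholders, not real experimental data.
KNOWN_REACTIONS = [
    (frozenset({"C-O:broken", "O-H:formed", "C-H:formed", "c:c"}),
     "catalyst A / solvent X"),
    (frozenset({"C-O:broken", "O-H:formed", "C-H:formed", "C=O"}),
     "catalyst A / solvent Y"),
    (frozenset({"N-C:broken", "N-H:formed", "C-H:formed", "c:c"}),
     "catalyst B / solvent X"),
]


def predict_conditions(query: frozenset, known=KNOWN_REACTIONS, k: int = 1) -> str:
    """Return the most common conditions among the k most similar reactions."""
    ranked = sorted(known, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(conditions for _, conditions in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    query = frozenset({"C-O:broken", "O-H:formed", "C-H:formed", "c:c"})
    print(predict_conditions(query))
```

With `k=1` the query simply inherits the conditions of its single most similar known reaction; a larger `k` turns the lookup into a majority vote over the nearest neighbours.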
