Towards automation of chemical process route selection based on data mining

A methodology for developing and evaluating chemical routes on the basis of data mining is presented. A section of the Reaxys database was converted into a network, which was then used to plan hypothetical synthesis routes for converting a bio-waste feedstock, limonene, into a bulk intermediate, benzoic acid. The route evaluation took process conditions into account and combined multiple indicators (exergy, E-factor, solvent score, reaction reliability and route redox efficiency) in a multi-criteria environmental sustainability evaluation. The proposed methodology is the first route evaluation that is based on data mining and explicitly uses reaction conditions, and it is amenable to full automation.
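
To make the network idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes reactions can be reduced to (reactant, product) pairs, enumerates routes with a breadth-first search, and computes an E-factor as waste mass per unit product mass. The reaction list, the function names (`build_network`, `find_routes`, `e_factor`) and the masses are hypothetical placeholders and are not taken from Reaxys or from the paper.

```python
from collections import deque

# Hypothetical toy reaction list: each entry is (reactant, product).
# Illustrative placeholders only; not data from Reaxys or the paper.
REACTIONS = [
    ("limonene", "p-cymene"),
    ("limonene", "terpinene"),
    ("terpinene", "p-cymene"),
    ("p-cymene", "toluene"),
    ("p-cymene", "benzoic acid"),
    ("toluene", "benzoic acid"),
]


def build_network(reactions):
    """Turn (reactant, product) pairs into a directed adjacency map."""
    graph = {}
    for reactant, product in reactions:
        graph.setdefault(reactant, []).append(product)
    return graph


def find_routes(graph, source, target, max_steps=4):
    """Enumerate simple routes from source to target, shortest first.

    A route is a list of species; it has len(route) - 1 reaction steps.
    """
    routes = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            routes.append(path)
            continue
        if len(path) - 1 >= max_steps:
            continue  # step budget used up; do not extend this path
        for nxt in graph.get(node, []):
            if nxt not in path:  # avoid revisiting a species (no cycles)
                queue.append(path + [nxt])
    return routes


def e_factor(total_input_mass, product_mass):
    """E-factor = mass of waste / mass of product (same mass units)."""
    return (total_input_mass - product_mass) / product_mass


if __name__ == "__main__":
    network = build_network(REACTIONS)
    for route in find_routes(network, "limonene", "benzoic acid"):
        print(" -> ".join(route))
    # Hypothetical masses: 10 kg of total inputs giving 2 kg of product.
    print("E-factor:", e_factor(10.0, 2.0))
```

In a fuller evaluation each route would carry several indicators rather than the E-factor alone, and the step-level values would depend on the recorded reaction conditions; this sketch leaves that aggregation out.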
