Construction of a Statistical Evaluation Model Based on Molecular Centrality to Find Retrosynthetically Important Bonds in Organic Compounds

To find retrosynthetically important bonds in a molecule, we defined a new evaluation score through a logistic regression analysis of known reactions stored in reaction databases. We assumed that the reaction-center bonds recorded in these databases are among the most retrosynthetically important bonds of each product structure. The derived statistical equation consists of a bond centrality term and a bond dissociation energy term, and it shows that synthetically useful bonds tend to be more central in the molecule and weaker. The coefficients of two statistical equations derived from two different reaction data sets are quite similar to each other. A comparison of molecular complexities and a validation with 35 complex organic compounds indicated that the evaluation equation is useful. (© Wiley-VCH Verlag GmbH & Co. KGaA, 69451 Weinheim, Germany, 2008)
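
As a rough illustration of the kind of score described above, the following Python sketch evaluates a logistic function of a bond centrality term and a bond dissociation energy (BDE) term and ranks bonds by it. The coefficient values, the function names, the centrality scale, and the example bonds are all hypothetical; only the signs follow the abstract (more central and weaker bonds score higher). The fitted coefficients and the exact centrality definition are not given here.

```python
import math

# Hypothetical coefficients: NOT the fitted values from the paper.
# Only the signs follow the abstract.
B0 = 0.0             # intercept (assumed)
B_CENTRALITY = 2.0   # positive: more central bonds score higher
B_BDE = -0.01        # negative: weaker bonds (lower BDE) score higher


def bond_score(centrality, bde):
    """Logistic score in (0, 1) from a centrality value and a BDE value."""
    z = B0 + B_CENTRALITY * centrality + B_BDE * bde
    return 1.0 / (1.0 + math.exp(-z))


def rank_bonds(bonds):
    """Return bond labels sorted from highest to lowest score.

    bonds maps a label to a (centrality, bde) tuple.
    """
    return sorted(bonds, key=lambda label: bond_score(*bonds[label]), reverse=True)


# Made-up bonds: centrality on an assumed 0..1 scale, BDE in assumed kJ/mol.
bonds = {
    "central_weak": (0.9, 300.0),
    "central_strong": (0.9, 450.0),
    "peripheral_weak": (0.2, 300.0),
}
ranking = rank_bonds(bonds)
print(ranking)
```

In an actual fit, the coefficients would be estimated from labeled bonds (reaction center or not) in a reaction data set rather than set by hand.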
