Statistics of the network of organic chemistry

Organic chemistry can be represented as a network of reactions and studied with the mathematical tools of graph theory. In this paper, the structure of a network of organic reactions is studied using several graph-theory metrics. The network was built from a section of chemical space downloaded from Reaxys. The studied area corresponds to the chemistry of terpenes and, after filtering an initial set of 35 million reactions, includes 12 238 931 species and 12 939 422 reactions. Analysis of the network statistics confirmed that the network is scale-free, as reported in the earlier literature from the analysis of a much smaller network. In many networks from other technological or non-technological areas, nodes show a preference for connecting to either highly connected or sparsely connected nodes; for chemistry no such trend was observed. The network of reactions was found to exhibit ‘small world’ behaviour: by analogy with the ‘six degrees of separation’ encountered in social networks, any molecule could on average be made from any other molecule in six synthesis steps. Scale-free networks have hubs in their wiring pattern. An investigation of whether these hubs are not only well studied but also frequently used showed that they concentrate a large share of the network's load onto themselves. This indicates that the network's structure influences how chemistry is used, or vice versa, and implies a hierarchy of molecules.
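As an illustration of two of the metrics named above, the following minimal sketch computes a degree distribution and a mean shortest path length (the quantity behind the ‘six synthesis steps’ statement) on a toy directed reaction graph. The five species and six reactions are hypothetical and are not taken from the Reaxys data; the function names are assumptions made for this example, and the paper's own tooling is not described here.

```python
from collections import Counter, deque

# Hypothetical toy reactions (reactant -> product); NOT data from the paper.
REACTIONS = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"), ("E", "A")]


def build_adjacency(edges):
    """Map each species to the set of species it can be converted into."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())  # make sure pure products are listed too
    return adj


def degree_distribution(edges):
    """Count how many species have each total degree (in-degree + out-degree)."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return Counter(degree.values())


def mean_path_length(adj):
    """Average number of steps over all ordered pairs connected by a directed path."""
    total, pairs = 0, 0
    for start in adj:
        # Breadth-first search gives shortest step counts from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the start node itself
    return total / pairs if pairs else 0.0


adjacency = build_adjacency(REACTIONS)
print(dict(degree_distribution(REACTIONS)))
print(mean_path_length(adjacency))
```

On a network of millions of species, an exhaustive breadth-first search from every node would be costly, so in practice the mean path length would more likely be estimated from a sample of start nodes; that choice is an assumption of this sketch rather than something stated in the abstract.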
