Design and characterization of chemical space networks for different compound data sets

Chemical Space Networks (CSNs) are generated for different compound data sets on the basis of pairwise similarity relationships. Such networks are thought to complement and further extend traditional coordinate-based views of chemical space. Our proof-of-concept study focuses on CSNs based on fingerprint similarity relationships calculated using the conventional Tanimoto similarity metric. The resulting CSNs are characterized with statistical measures from network science and compared with one another. We show that the homophily principle, which is widely considered in the context of social networks, is a major determinant of the topology of CSNs of bioactive compounds when these are designed as threshold networks, and that it typically gives rise to community structures. Many properties of CSNs are influenced by numerical features of the Tanimoto metric and are largely dominated by the edge density of the networks, which depends on the chosen similarity threshold. However, properties of different CSNs with constant edge density can be directly compared, revealing systematic differences between CSNs generated from randomly collected compounds and those generated from bioactive compounds.
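
The threshold-network construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: fingerprints are represented here as plain Python sets of on-bit indices, the toy fingerprints are invented for illustration, and all function names (`tanimoto`, `pairwise_similarities`, `threshold_edges`, `edge_density`, `threshold_for_density`) are hypothetical. The sketch shows how an edge is drawn whenever the Tanimoto similarity of two compounds reaches a threshold, how the edge density follows from that threshold, and how a threshold might be chosen to reach a target edge density so that different CSNs can be compared.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pairwise_similarities(fps):
    """Map each unordered compound pair (i, j), i < j, to its Tanimoto similarity."""
    return {
        (i, j): tanimoto(fps[i], fps[j])
        for i, j in combinations(range(len(fps)), 2)
    }


def threshold_edges(sims, threshold):
    """Edges of the threshold network: pairs whose similarity is >= threshold."""
    return [pair for pair, s in sims.items() if s >= threshold]


def edge_density(n_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) // 2
    return len(edges) / possible if possible else 0.0


def threshold_for_density(sims, target_density):
    """Similarity value of the k-th most similar pair, k = target_density * n_pairs.

    Assumption: ties at the returned value are all kept as edges, so the
    resulting density can exceed the target slightly.
    """
    ranked = sorted(sims.values(), reverse=True)
    k = max(1, round(target_density * len(ranked)))
    return ranked[k - 1]


# Toy fingerprints (invented for illustration, not real compound data).
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 6, 7}, {8, 9, 10}, {8, 9, 11}]
sims = pairwise_similarities(fps)

edges = threshold_edges(sims, 0.5)
print(sorted(edges), edge_density(len(fps), edges))

t = threshold_for_density(sims, 0.4)
print(t, edge_density(len(fps), threshold_edges(sims, t)))
```

Because a fixed threshold yields different edge densities for different data sets, selecting the threshold per data set to match a common target density is one plausible way to obtain networks of constant edge density for comparison.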
