The Fragment Network: A Chemistry Recommendation Engine Built Using a Graph Database.

The hit validation stage of a fragment-based drug discovery campaign involves probing the structure-activity relationship (SAR) around one or more fragment hits. This often requires a search for similar compounds in a corporate collection or from commercial suppliers. The Fragment Network is a graph database that allows a user to efficiently search the chemical space around a compound of interest. The result set is chemically intuitive, naturally grouped by substitution pattern, and meaningfully sorted according to the number of observations of each transformation in medicinal chemistry databases. This paper describes the algorithms used to construct and search the Fragment Network and provides examples of how it may be used in a drug discovery context.
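The idea of searching a graph around a query compound, grouping the hits by transformation, and sorting the groups by observation count can be illustrated with a small sketch. This is not the algorithm from the paper: the edge list, the `-X`/`+X` transformation labels, the observation counts, and the function names `build_adjacency`, `search`, and `rank` are all hypothetical, chosen only to show the general shape of such a query in plain Python.

```python
from collections import defaultdict, deque

# Hypothetical toy data: each edge links a molecule (SMILES) to a smaller
# fragment obtained by removing one substituent. Not taken from the paper.
EDGES = [
    # (parent, child, removed_group)
    ("Cc1ccccc1", "c1ccccc1", "C"),
    ("Oc1ccccc1", "c1ccccc1", "O"),
    ("Nc1ccccc1", "c1ccccc1", "N"),
    ("Cc1ccccc1O", "Cc1ccccc1", "O"),
]

# Hypothetical counts of how often each transformation has been observed.
OBSERVED = {("-C", "+O"): 120, ("-C", "+N"): 45, ("+O",): 300, ("-C",): 80}


def build_adjacency(edges):
    """Index edges in both directions: '-X' removes a group, '+X' adds one."""
    adj = defaultdict(list)
    for parent, child, group in edges:
        adj[parent].append((child, "-" + group))
        adj[child].append((parent, "+" + group))
    return adj


def search(adj, query, max_hops=2):
    """Breadth-first search that groups hits by the transformation path."""
    groups = defaultdict(list)
    seen = {query}
    queue = deque([(query, ())])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour, label in adj[node]:
            if neighbour in seen:
                continue
            seen.add(neighbour)
            new_path = path + (label,)
            groups[new_path].append(neighbour)
            queue.append((neighbour, new_path))
    return dict(groups)


def rank(groups, observed):
    """Sort transformation groups by observation count, most frequent first."""
    return sorted(groups.items(),
                  key=lambda kv: observed.get(kv[0], 0),
                  reverse=True)


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for path, hits in rank(search(adjacency, "Cc1ccccc1"), OBSERVED):
        print(" ".join(path), hits)
```

In this sketch each group key is the sequence of removals and additions that leads from the query to the hit, so compounds reached by the same substitution pattern end up together, and the hop limit bounds how far from the query the search is allowed to go.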
