Ligand-Based Virtual Screening using Random Walk Kernel and Empirical Filters

Abstract Drug discovery is a time-consuming and costly process. The volume of data generated during the various stages of drug discovery is increasing rapidly, which pushes machine-learning scientists to develop faster and more effective methods that use these data to reduce cost and time. Molecular graphs are highly expressive representations, which allows machine-learning algorithms to be implemented faster. During the discovery phase, virtual (in silico) screening plays a major role in optimizing synthesis efforts and reducing the attrition rate of new chemical entities (NCEs). In the present work, virtual screening using a walk kernel was combined with empirical filters. The model was applied to two classification problems, predicting mutagenicity and toxicity, on two publicly available datasets. The accuracies obtained were 67% for the PTC dataset and 87% for the MUTAG dataset. The combined method was found to be more accurate and to incur a lower computational cost.
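To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Lipinski-style rule-of-five check as the empirical filter and a fixed-length walk kernel computed on the direct product graph. The graph encoding (a label list plus an adjacency dict), the descriptor keys, the function names, and the parameters `max_len`, `decay`, and `max_violations` are all hypothetical choices made for illustration.

```python
def passes_rule_of_five(desc, max_violations=0):
    """Empirical filter (assumed here: Lipinski-style thresholds).

    `desc` is a dict of precomputed descriptors with hypothetical keys.
    """
    violations = sum([
        desc["mol_weight"] > 500,
        desc["logp"] > 5,
        desc["h_donors"] > 5,
        desc["h_acceptors"] > 10,
    ])
    return violations <= max_violations


def walk_kernel(g1, g2, max_len=3, decay=0.5):
    """Count common labeled walks of length 0..max_len, weighted by decay**length.

    Each graph is (labels, adj): a list of atom labels and a dict mapping
    each node index to the list of its neighbours (undirected).
    """
    labels1, adj1 = g1
    labels2, adj2 = g2
    # Nodes of the direct product graph: pairs of atoms with equal labels.
    nodes = [(i, j)
             for i in range(len(labels1))
             for j in range(len(labels2))
             if labels1[i] == labels2[j]]
    node_set = set(nodes)
    # Two product nodes are adjacent if both components are adjacent.
    neighbours = {
        (i, j): [(a, b) for a in adj1[i] for b in adj2[j] if (a, b) in node_set]
        for (i, j) in nodes
    }
    counts = {v: 1.0 for v in nodes}  # walks of length 0 ending at each node
    total, weight = 0.0, 1.0
    for _ in range(max_len + 1):
        total += weight * sum(counts.values())
        # Extend every walk by one step.
        counts = {v: sum(counts[u] for u in neighbours[v]) for v in nodes}
        weight *= decay
    return total


# Heavy-atom skeletons C-C-O and C-O as toy molecular graphs.
g_a = (["C", "C", "O"], {0: [1], 1: [0, 2], 2: [1]})
g_b = (["C", "O"], {0: [1], 1: [0]})

# Descriptor values below are illustrative, not measured data.
candidate = {"mol_weight": 180.2, "logp": 1.2, "h_donors": 1, "h_acceptors": 4}

if passes_rule_of_five(candidate):
    similarity = walk_kernel(g_a, g_b, max_len=2, decay=0.5)
```

In this sketch the cheap filter runs first, so the kernel is evaluated only on molecules that survive it; that ordering is one plausible way the combination could reduce computational cost, and the resulting kernel values could then be passed to any kernel-based classifier.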
