Towards in silico identification of the human ether-a-go-go-related gene channel blockers: discriminative vs. generative classification models

hERG potassium channels play a critical role in the normal electrical activity of the heart. Blockade of hERG channels in cardiac cells can cause long QT syndrome, a potentially fatal disorder. hERG channels can be blocked by structurally diverse compounds belonging to several drug classes. This work presents generative (Generative Topographic Maps) and discriminative (Support Vector Machines) classification models that categorize compounds in silico as active or inactive using different types of descriptors, and compares the predictive performance of the two approaches. To our knowledge, this is the first demonstration that Generative Topographic Maps can be used for applicability domain analysis and to generate probability-based descriptors. The results are also compared with models developed by other teams on the same data set.
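The distinction between the two model families can be illustrated with a minimal sketch. It is not the authors' method. A per-class diagonal Gaussian stands in for the generative model (instead of a Generative Topographic Map), and logistic regression stands in for the discriminative model (instead of a Support Vector Machine). The toy 2-D "descriptors", all function names, and all parameter values are assumptions made for illustration only. The generative model also yields a data density p(x), which is the kind of quantity an applicability domain check can threshold.

```python
import math
import random

def make_data(n, seed=0):
    """Toy 2-D 'descriptors' (assumed): actives near (2, 2), inactives near (0, 0)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 2.0 if label == 1 else 0.0
        X.append([rng.gauss(centre, 0.7), rng.gauss(centre, 0.7)])
        y.append(label)
    return X, y

def fit_generative(X, y):
    """Generative stand-in: model p(x | class) as a diagonal Gaussian, plus a class prior."""
    model = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        dims = range(len(rows[0]))
        mean = [sum(r[d] for r in rows) / len(rows) for d in dims]
        var = [sum((r[d] - mean[d]) ** 2 for r in rows) / len(rows) + 1e-6 for d in dims]
        model[c] = (mean, var, len(rows) / len(X))
    return model

def log_joint(model, x, c):
    """log p(x, class=c) = log p(x | c) + log p(c)."""
    mean, var, prior = model[c]
    ll = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
             for xi, m, v in zip(x, mean, var))
    return ll + math.log(prior)

def predict_generative(model, x):
    """Bayes' rule: pick the class with the larger joint probability."""
    return 1 if log_joint(model, x, 1) > log_joint(model, x, 0) else 0

def log_density(model, x):
    """log p(x); low values suggest x lies outside the applicability domain."""
    a, b = log_joint(model, x, 0), log_joint(model, x, 1)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def fit_discriminative(X, y, epochs=200, lr=0.1):
    """Discriminative stand-in: logistic regression fitted by stochastic gradient ascent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            w = [wi + lr * (t - p) * xi for wi, xi in zip(w, x)]
            b += lr * (t - p)
    return w, b

def predict_discriminative(params, x):
    w, b = params
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

X_train, y_train = make_data(200, seed=0)
X_test, y_test = make_data(200, seed=1)

gen = fit_generative(X_train, y_train)
dis = fit_discriminative(X_train, y_train)

ba_gen = balanced_accuracy(y_test, [predict_generative(gen, x) for x in X_test])
ba_dis = balanced_accuracy(y_test, [predict_discriminative(dis, x) for x in X_test])
print("generative BA:", round(ba_gen, 3), "discriminative BA:", round(ba_dis, 3))
```

Only the generative model can answer "how typical is this compound of the training data?" through `log_density`. The discriminative model returns a class for any input, however far it lies from the training set.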
