Kernel Methods for Chemical Compounds: From Classification to Design

In this paper, we briefly review kernel methods for the analysis of chemical compounds, with a focus on the authors' own work. We begin with a brief review of existing kernel functions used for classifying chemical compounds and predicting their activities. We then turn to the pre-image problem for chemical compounds: inferring a chemical structure that is mapped to a given feature vector. This problem has a potential application to the design of novel chemical compounds. In particular, we consider the pre-image problem for feature vectors consisting of frequencies of labeled paths of length at most K. We present several time complexity results: an NP-hardness result for the general case, a polynomial-time algorithm for tree-structured compounds with fixed K, and a polynomial-time algorithm for K=1 based on graph detachment. We then review practical algorithms for the pre-image problem, which are based on enumerating chemical structures that satisfy given constraints. We also briefly review related results, including efficient enumeration of stereoisomers of tree-like chemical compounds and efficient enumeration of outerplanar graphs.
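To make the feature vector concrete, the following minimal Python sketch counts labeled paths of length at most K in a small labeled graph. It is an illustration under stated assumptions, not the authors' implementation: the function name `path_frequencies` and the input encoding are hypothetical, path length is measured in edges, only simple paths (no repeated vertex) are counted, and each direction of traversal is counted separately. The paper's exact conventions may differ.

```python
from collections import Counter


def path_frequencies(labels, edges, K):
    """Count label sequences of simple paths with at most K edges.

    labels: dict mapping vertex id -> atom label (hypothetical encoding)
    edges:  iterable of undirected edges (u, v)
    K:      maximum path length, measured in edges (assumption)
    """
    # Build an adjacency list for the undirected graph.
    adj = {v: [] for v in labels}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    counts = Counter()

    def extend(path):
        # Record the label sequence of the current path.
        counts[tuple(labels[v] for v in path)] += 1
        if len(path) - 1 == K:
            return  # the path already has K edges
        for w in adj[path[-1]]:
            if w not in path:  # keep the path simple
                extend(path + [w])

    # Start a path from every vertex; a single vertex is a path of length 0.
    for v in labels:
        extend([v])
    return counts


# Hydrogen-suppressed ethanol skeleton: C - C - O
labels = {0: "C", 1: "C", 2: "O"}
edges = [(0, 1), (1, 2)]
fv = path_frequencies(labels, edges, K=1)
for seq, n in sorted(fv.items()):
    print("".join(seq), n)
```

The pre-image problem discussed above runs in the opposite direction: given such a table of counts, find a graph (for example, one satisfying valence constraints) whose path frequencies match it.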
