gBoost: a mathematical programming approach to graph classification and regression

Graph mining methods enumerate frequently appearing subgraph patterns, which can be used as features for subsequent classification or regression. However, frequent patterns are not necessarily informative for the given learning problem. We propose a mathematical programming boosting method (gBoost) that progressively collects informative patterns. Compared to AdaBoost, gBoost can build the prediction rule with fewer iterations. To apply the boosting method to graph data, a branch-and-bound pattern search algorithm is developed based on the DFS code tree. The constructed search space is reused in later iterations to minimize the computation time. Our method can learn more efficiently than the simpler method based on frequent substructure mining, because the output labels are used as an extra information source for pruning the search space. Furthermore, by engineering the mathematical program, a wide range of machine learning problems can be solved without modifying the pattern search algorithm.
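
The pruning idea in the abstract, using output labels to cut the pattern search space, can be illustrated with a minimal sketch. It rests on several assumptions that are not in the text above: item sets stand in for graphs (a pattern is a subset of items rather than a subgraph), extending a pattern by larger items stands in for walking the DFS code tree, and the gain is taken to be the weighted correlation between labels and a ±1 pattern indicator. The bound used is the generic one that follows from supports shrinking as patterns grow, under non-negative example weights; it is not claimed to be the paper's exact formulation. The names `find_best_pattern`, `data`, `y`, and `u` are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple


def find_best_pattern(
    data: Sequence[FrozenSet[int]],
    y: Sequence[int],
    u: Sequence[float],
) -> Tuple[Tuple[int, ...], float, int]:
    """Return (pattern, |gain|, number of patterns evaluated).

    Assumed gain of pattern t: sum_i u[i] * y[i] * (2 * [t in data[i]] - 1),
    with labels y[i] in {+1, -1} and weights u[i] >= 0.
    """
    total = sum(ui * yi for ui, yi in zip(u, y))
    items = sorted(set().union(*data))
    best_pattern: Tuple[int, ...] = ()
    best_score = 0.0
    evaluated = 0

    def search(pattern: Tuple[int, ...], support: List[int], start: int) -> None:
        nonlocal best_pattern, best_score, evaluated
        for k in range(start, len(items)):
            # Examples that still contain the extended pattern.
            new_support = [i for i in support if items[k] in data[i]]
            if not new_support:
                continue  # the pattern does not occur in the data
            evaluated += 1
            candidate = pattern + (items[k],)
            weighted = sum(u[i] * y[i] for i in new_support)
            score = abs(2.0 * weighted - total)
            if score > best_score:
                best_pattern, best_score = candidate, score
            # Any extension can only lose supporting examples, so its |gain|
            # is at most what it would be if it kept only one class.
            pos = sum(u[i] for i in new_support if y[i] > 0)
            neg = sum(u[i] for i in new_support if y[i] < 0)
            bound = max(2.0 * pos - total, 2.0 * neg + total)
            if bound > best_score:  # otherwise prune the whole subtree
                search(candidate, new_support, k + 1)

    search((), list(range(len(data))), 0)
    return best_pattern, best_score, evaluated


# Toy usage: five examples with uniform weights.
data = [
    frozenset({1, 2}),
    frozenset({1, 2, 3}),
    frozenset({2, 3}),
    frozenset({3}),
    frozenset({1, 3}),
]
y = [1, 1, -1, -1, -1]
u = [0.2] * 5
pattern, score, evaluated = find_best_pattern(data, y, u)
print(pattern, round(score, 6), evaluated)
```

In this toy data the pattern `(1, 2)` occurs in exactly the two positive examples, so it separates the classes perfectly and the search stops expanding branches whose bound cannot beat it. In a boosting loop, the weights `u` would come from the current solution of the mathematical program, and the search would be repeated with updated weights at each iteration.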
