Boosting for graph classification with universum

Recent years have witnessed extensive studies of graph classification, driven by the rapid increase in applications involving structural data and complex relationships. Existing graph classification methods require training graphs to be relevant (or belong) to the target class, and cannot integrate graphs irrelevant to the class of interest into the learning process. In this paper, we study a new universum graph classification framework which leverages additional “non-example” graphs to help improve graph classification accuracy. We argue that although universum graphs do not belong to the target class, they may contain meaningful structural patterns that help enrich the feature space for graph representation and classification. To support universum graph classification, we propose a mathematical programming algorithm, ugBoost, which integrates discriminative subgraph selection and margin maximization into a unified framework to fully exploit the universum. Because informative subgraph exploration in a universum setting requires searching a large space, we derive an upper bound on the discriminative score of each subgraph and employ a branch-and-bound scheme to prune the search space. Using the explored subgraphs, our graph classification model aims to simultaneously maximize the margin between positive and negative graphs and minimize the loss on the universum graph examples. Subgraph exploration and learning are integrated and performed iteratively so that each benefits the other. Experimental results and comparisons on real-world datasets demonstrate the performance of our algorithm.
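To make the combined objective concrete, here is a minimal sketch of a universum-style objective over subgraph features. It is not ugBoost's actual mathematical program. The sketch assumes a hinge loss on labeled graphs, an epsilon-insensitive loss on universum graphs (a common choice in universum learning generally), and an L1 penalty on subgraph weights. All names (`universum_objective`, `c_labeled`, `c_universum`, `eps`) and the 0/1 subgraph-indicator encoding are hypothetical.

```python
from typing import List, Sequence, Tuple


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear score of a graph from its subgraph-indicator features."""
    return sum(w * x for w, x in zip(weights, features))


def hinge(margin: float) -> float:
    """Hinge loss: zero once a labeled graph has margin of at least 1."""
    return max(0.0, 1.0 - margin)


def eps_insensitive(value: float, eps: float) -> float:
    """Loss that is zero while a universum graph scores within eps of the boundary."""
    return max(0.0, abs(value) - eps)


def universum_objective(
    weights: Sequence[float],
    labeled: List[Tuple[Sequence[float], int]],  # (features, label in {+1, -1})
    universum: List[Sequence[float]],            # features of "non-example" graphs
    c_labeled: float = 1.0,                      # assumed trade-off parameter
    c_universum: float = 1.0,                    # assumed trade-off parameter
    eps: float = 0.1,                            # assumed insensitivity width
) -> float:
    """L1 penalty + hinge loss on labeled graphs + loss on universum graphs."""
    l1_penalty = sum(abs(w) for w in weights)
    labeled_loss = sum(hinge(y * score(weights, x)) for x, y in labeled)
    universum_loss = sum(eps_insensitive(score(weights, x), eps) for x in universum)
    return l1_penalty + c_labeled * labeled_loss + c_universum * universum_loss


# Toy data: two subgraph features, one positive and one negative graph.
weights = [1.0, -1.0]
labeled = [([1, 0], +1), ([0, 1], -1)]

# A universum graph containing both subgraphs scores 0 and adds no loss.
on_boundary = universum_objective(weights, labeled, [[1, 1]])

# A universum graph that looks like a positive graph is penalized.
off_boundary = universum_objective(weights, labeled, [[1, 0]])
```

In this toy setting the universum term pushes "non-example" graphs toward the decision boundary, which is the intuition behind minimizing universum loss alongside margin maximization. The subgraph mining and branch-and-bound pruning steps are outside the scope of this sketch.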
