Joint Structure Feature Exploration and Regularization for Multi-Task Graph Classification

Graph classification aims to learn models that classify graph-structured data. To date, existing graph classification methods have been designed for a single learning task and require a large number of labeled samples to learn a good classification model. In reality, each real-world task may have only a limited number of labeled samples, yet multiple similar learning tasks can provide useful knowledge that benefits all tasks as a whole. In this paper, we formulate a new multi-task graph classification (MTG) problem, in which multiple graph classification tasks are jointly regularized to find discriminative subgraphs shared by all tasks for learning. The niche of MTG stems from the fact that, with a limited number of training samples, subgraph features selected for a single graph classification task tend to overfit the training data. By using additional tasks as evaluation sets, MTG jointly regularizes multiple tasks to explore high-quality subgraph features for graph classification. To achieve this goal, we formulate an objective function that combines multiple graph classification tasks to evaluate the informativeness score of a subgraph feature. An iterative subgraph feature exploration and multi-task learning process is further proposed to incrementally select subgraph features for graph classification. Experiments on real-world multi-task graph classification datasets demonstrate significant performance gains.
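
The abstract leaves the concrete selection loop implicit. The following minimal Python sketch is not the authors' implementation; the indicator-vector representation, the correlation-based informativeness score, and all function names are illustrative assumptions. It only shows, in broad strokes, how a joint multi-task score over candidate subgraphs could drive an incremental selection process of the kind described above.

    import numpy as np

    def score_subgraph(indicator_per_task, labels_per_task):
        # Hypothetical informativeness score: sum over tasks of the absolute
        # correlation between the subgraph's 0/1 occurrence indicator and the
        # task labels, so a subgraph is rewarded only when it is discriminative
        # for several tasks at once (not just for a single task).
        score = 0.0
        for x, y in zip(indicator_per_task, labels_per_task):
            x = np.asarray(x, dtype=float)
            y = np.asarray(y, dtype=float)
            if x.std() == 0 or y.std() == 0:   # constant vectors carry no signal
                continue
            score += abs(np.corrcoef(x, y)[0, 1])
        return score

    def select_subgraphs(candidates, labels_per_task, k):
        # Incrementally pick the k candidates with the highest joint score,
        # mimicking an iterative exploration-and-selection loop.
        selected, remaining = [], list(candidates)
        for _ in range(min(k, len(remaining))):
            best = max(remaining,
                       key=lambda g: score_subgraph(candidates[g], labels_per_task))
            selected.append(best)
            remaining.remove(best)
        return selected

    # Toy usage: two tasks with five training graphs each and three candidate
    # subgraphs, each represented by a per-task occurrence indicator vector.
    rng = np.random.default_rng(0)
    labels = [rng.integers(0, 2, 5) for _ in range(2)]
    candidates = {f"sg{i}": [rng.integers(0, 2, 5) for _ in range(2)] for i in range(3)}
    print(select_subgraphs(candidates, labels, k=2))

In the actual method, the candidate set would come from a subgraph mining procedure and the score would be derived from the joint multi-task objective rather than a simple correlation; the sketch only illustrates the shared-evaluation idea.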
