Task Sensitive Feature Exploration and Learning for Multitask Graph Classification

Multitask learning (MTL) is commonly used to jointly optimize multiple learning tasks. To date, all existing MTL methods have been designed for tasks whose instances are represented as feature vectors, and they cannot be applied to structured data such as graphs. More importantly, existing methods mainly focus on exploring the overall commonality or disparity between tasks, but cannot explicitly capture task relationships in the feature space. They are therefore unable to answer important questions, such as what exactly is shared between tasks and what makes one task unique relative to the others. In this paper, we formulate a new multitask graph learning problem and propose a task sensitive feature exploration and learning algorithm for multitask graph classification. Because graphs do not come with readily available features, we advocate a task sensitive feature exploration and learning paradigm to jointly discover discriminative subgraph features across different tasks. In addition, a feature learning process categorizes each subgraph feature into one of three categories: 1) common feature; 2) task auxiliary feature; and 3) task specific feature, indicating whether the feature is shared by all tasks, by a subset of tasks, or by only one specific task, respectively. Feature learning and multitask learning are iteratively optimized to form a multitask graph classification model with a global optimization goal. Experiments on real-world functional brain analysis and chemical compound categorization demonstrate the algorithm’s performance. The results confirm that our method explicitly captures task correlations and uniqueness in the feature space, and explicitly answers what is shared between tasks and what is unique to a specific task.
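The three feature categories can be illustrated with a small sketch. It is not the paper's algorithm: it assumes that some multitask learner has already produced one weight per subgraph feature per task, and it simply reads the category off the number of tasks in which a feature has a non-negligible weight. The function name `categorize_features`, the tolerance `tol`, the extra "unused" label, and the example weights are all hypothetical.

```python
from typing import Dict, List


def categorize_features(
    weights: Dict[str, List[float]], tol: float = 1e-8
) -> Dict[str, str]:
    """Label each subgraph feature by the number of tasks that use it.

    `weights` maps a feature name to its per-task weights (one entry per task).
    A weight counts as "used" when its magnitude exceeds `tol`.
    """
    labels: Dict[str, str] = {}
    for name, per_task in weights.items():
        n_tasks = len(per_task)
        active = sum(1 for w in per_task if abs(w) > tol)
        if active == 0:
            labels[name] = "unused"          # not selected by any task (assumed extra label)
        elif active == n_tasks:
            labels[name] = "common"          # shared by all tasks
        elif active == 1:
            labels[name] = "task specific"   # used by exactly one task
        else:
            labels[name] = "task auxiliary"  # shared by a proper subset of tasks
    return labels


# Hypothetical weights for four subgraph features over three tasks.
example_weights = {
    "g1": [0.8, 0.5, 0.3],
    "g2": [0.6, 0.0, 0.4],
    "g3": [0.0, 0.9, 0.0],
    "g4": [0.0, 0.0, 0.0],
}
labels = categorize_features(example_weights)
for feature, category in labels.items():
    print(feature, category)
```

Reading categories from the support of a weight matrix is only one plausible way to make the definitions concrete; the abstract states that the actual method learns the categorization jointly with the classifiers.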
