Incremental Subgraph Feature Selection for Graph Classification

Graph classification is an important tool for analyzing data with structural dependencies, where subgraphs are often used as features for learning. In practice, the number of subgraph features depends crucially on the frequency support threshold and may become extremely large. As a result, subgraphs may have to be discovered incrementally, forming a feature stream, and the underlying graph classifier must then select representative subgraph features from this stream. In this paper, we propose a primal-dual incremental subgraph feature selection algorithm (ISF) based on a max-margin graph classifier. The ISF algorithm constructs a sequence of solutions that are both primal and dual feasible. Each primal-dual pair shrinks the duality gap and yields a better approximation of the optimal subgraph feature set. To avoid the bias of the ISF algorithm toward short-pattern subgraph features, we present a new incremental subgraph join feature selection algorithm (ISJF), which forces graph classifiers to join short-pattern subgraphs and generate long-pattern subgraph features. We evaluate the proposed models on both synthetic networks and real-world social network data sets. Experimental results demonstrate the effectiveness of the proposed methods.
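To make the idea of selecting features from a subgraph feature stream concrete, the following is a minimal sketch and not the ISF algorithm itself. It assumes each graph is already encoded as a binary vector of subgraph occurrences, that candidate features arrive in batches, and that a candidate is admitted when its hinge-loss subgradient magnitude (a simple stand-in for a "most violated constraint" test) exceeds a threshold. The function names, the threshold, the learning rate, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[int]


def margins(X: List[Vector], y: List[int], w: Dict[int, float]) -> List[float]:
    """Margin y_i * <w, x_i>, using only the currently selected features."""
    return [yi * sum(wj * xi[j] for j, wj in w.items()) for xi, yi in zip(X, y)]


def violation_score(X: List[Vector], y: List[int], w: Dict[int, float], f: int) -> float:
    """|sum_i y_i x_i[f]| / n over graphs whose margin is below 1.

    Assumption: this hinge-loss subgradient magnitude stands in for the
    constraint-violation test of a max-margin feature selector.
    """
    m = margins(X, y, w)
    total = sum(yi * xi[f] for xi, yi, mi in zip(X, y, m) if mi < 1.0)
    return abs(total) / len(y)


def retrain(X: List[Vector], y: List[int], w: Dict[int, float],
            epochs: int = 50, lr: float = 0.1, reg: float = 0.01) -> None:
    """Subgradient descent on the L2-regularized hinge loss (updates w in place)."""
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            mi = yi * sum(wj * xi[j] for j, wj in w.items())
            for j in w:
                grad = reg * w[j] - (yi * xi[j] if mi < 1.0 else 0.0)
                w[j] -= lr * grad


def select_from_stream(X: List[Vector], y: List[int],
                       batches: List[List[int]],
                       threshold: float = 0.3) -> Dict[int, float]:
    """Admit at most one feature per batch if its violation score is large enough."""
    w: Dict[int, float] = {}
    for batch in batches:
        best = max(batch, key=lambda f: violation_score(X, y, w, f))
        if violation_score(X, y, w, best) > threshold:
            w[best] = 0.0
            retrain(X, y, w)
    return w


# Toy data (hypothetical): 6 graphs, 4 candidate subgraph features.
# Feature 0 occurs only in positive graphs; the others carry no label signal.
X = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 0],
]
y = [1, 1, 1, -1, -1, -1]
batches = [[1, 2], [0, 3]]  # features arrive in two batches

selected = select_from_stream(X, y, batches)
print(sorted(selected))  # prints [0]
```

In this toy run the first batch contains only uninformative features and nothing is admitted; the second batch contains the discriminative feature 0, which is selected and given a positive weight. A primal-dual method as described in the abstract would additionally track the duality gap to certify progress, which this sketch omits.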
