Incremental SVM-based classification in dynamic streaming networks

With the emergence of networked data, graph classification has received considerable interest in recent years. Most approaches to graph classification focus on designing effective kernels to compute similarities between static graphs. However, these approaches become computationally intractable in both time and space when a graph arrives incrementally with continuous updates, i.e., insertions of nodes and edges. In this paper, we examine the problem of classification in large-scale, incrementally changing graphs. We propose a framework that combines an incremental support vector machine (SVM) with the Weisfeiler-Lehman (W-L) graph kernel. Because the support vectors from each learning step are retained, the classification model can be updated incrementally whenever the graph changes. To facilitate the classification of nodes in a dynamic network, we design an entropy-based subgraph extraction strategy that selects informative neighbor nodes and discards those with less discriminative power. We validate the advantages of our learning techniques through an empirical evaluation on several large-scale real-world graph datasets, in comparison with other graph classification methods. The experimental results also confirm the benefits of our subgraph extraction method when combined with the incremental learning techniques.
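
To make the two building blocks named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It shows a plain W-L subtree kernel between two small labeled graphs and a Shannon entropy helper of the kind an entropy-based neighbor selection could rely on. All names (`wl_features`, `wl_kernel`, `label_entropy`), the graph representation (adjacency dict plus label dict), and the use of string signatures instead of compressed integer labels are assumptions made for illustration. The sketch also assumes node labels do not contain parentheses or commas.

```python
from collections import Counter
from math import log2


def wl_features(adj, labels, iterations=2):
    """Count W-L subtree labels of one graph (hypothetical helper).

    adj:    dict mapping node -> list of neighbor nodes
    labels: dict mapping node -> initial node label
    """
    current = {v: str(labels[v]) for v in adj}
    counts = Counter(current.values())  # iteration 0: original labels
    for _ in range(iterations):
        nxt = {}
        for v in adj:
            # New label = own label plus the sorted multiset of neighbor labels.
            neigh = sorted(current[u] for u in adj[v])
            nxt[v] = current[v] + "(" + ",".join(neigh) + ")"
        current = nxt
        counts.update(current.values())
    return counts


def wl_kernel(g1, g2, iterations=2):
    """Dot product of the W-L label histograms of two (adj, labels) graphs."""
    f1 = wl_features(*g1, iterations)
    f2 = wl_features(*g2, iterations)
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())


def label_entropy(class_labels):
    """Shannon entropy (in bits) of a list of class labels.

    Assumption: a neighborhood with low entropy is treated as more
    discriminative; the paper's exact criterion is not given in the abstract.
    """
    counts = Counter(class_labels)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Path graph A - B - A and a single edge A - B.
path = ({0: [1], 1: [0, 2], 2: [1]}, {0: "A", 1: "B", 2: "A"})
edge = ({0: [1], 1: [0]}, {0: "A", 1: "B"})
print(wl_kernel(path, edge, iterations=1))
print(label_entropy(["x", "x", "y", "y"]))
```

In an incremental setting, a kernel matrix built this way could be passed to any SVM that accepts precomputed kernels, retraining on the retained support vectors plus the newly arrived examples. That wiring is left out here because the abstract does not specify it.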
