Multi-class instance-incremental framework for classification in fully dynamic graphs

Existing work on graph classification is mostly restricted to static graphs. Static classification models prove ineffective in many real-life scenarios, which require an approach capable of handling dynamic data. Further, the limited work on dynamic graphs focuses mainly on purely incremental graphs, and the resulting methods fail to accommodate fully dynamic graphs (FDGs). Hence, in this paper, we propose a comprehensive framework for multi-class classification in fully dynamic graphs that combines the efficient Weisfeiler-Lehman (W-L) graph kernel with a multi-class support vector machine (SVM). The framework processes each update using the instance-incremental method while retaining all historical data in order to achieve higher accuracy. Reliable validation metrics are utilised for model parameter selection and output verification. Experimental results over four case studies on real-world data demonstrate the efficacy of our approach.
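
As an illustration of the kernel component only, the following is a minimal sketch of a W-L subtree kernel in plain Python. It is not the implementation used in this paper. The function names (`wl_features`, `wl_kernel`, `add_edge`, `remove_edge`), the adjacency-set representation, and the toy graphs are assumptions made for the example, and a fully dynamic update is assumed here to mean that edges may be both inserted and deleted. Node labels are kept as explicit strings rather than compressed identifiers, which is simple but would not scale to many iterations.

```python
from collections import Counter


def wl_features(adj, labels, iterations=2):
    """Count W-L subtree labels for one graph.

    adj: dict mapping node -> set of neighbouring nodes (undirected).
    labels: dict mapping node -> initial label; labels are assumed
            not to contain '(' , ')' or ','.
    """
    current = {v: str(labels[v]) for v in adj}
    features = Counter(current.values())
    for _ in range(iterations):
        relabelled = {}
        for v in adj:
            # New label = own label plus the sorted multiset of neighbour labels.
            neighbours = sorted(current[u] for u in adj[v])
            relabelled[v] = current[v] + "(" + ",".join(neighbours) + ")"
        current = relabelled
        features.update(current.values())
    return features


def wl_kernel(f1, f2):
    """Dot product of two W-L label-count vectors."""
    return sum(count * f2[label] for label, count in f1.items())


def add_edge(adj, u, v):
    """Insert an undirected edge (hypothetical helper for this sketch)."""
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def remove_edge(adj, u, v):
    """Delete an undirected edge (hypothetical helper for this sketch)."""
    adj[u].discard(v)
    adj[v].discard(u)


# Toy data: a triangle and a three-node path, all nodes labelled "A".
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
node_labels = {0: "A", 1: "A", 2: "A"}

before = wl_kernel(wl_features(triangle, node_labels),
                   wl_features(path, node_labels))

# A deletion update: only the changed graph needs its features recomputed.
remove_edge(triangle, 0, 2)
after = wl_kernel(wl_features(triangle, node_labels),
                  wl_features(path, node_labels))
print(before, after)
```

In a pipeline of the kind described above, the pairwise kernel values would form a Gram matrix passed to an SVM that accepts precomputed kernels. After each update, only the rows and columns of the updated graph need to be recomputed, while entries for unchanged historical graphs are retained.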
