On extending extreme learning machine to non-redundant synergy pattern based graph classification

Graph patterns are widely used to define the feature space for building efficient graph classification models. Synergy graph patterns are graphs in which the relationships among the nodes are highly inseparable. Compared with general graph patterns, synergy graph patterns have much higher discriminative power and are therefore more suitable as classification features. Extreme Learning Machine (ELM) is a simple and efficient learning algorithm for single-hidden-layer feedforward neural networks (SLFNs) with extremely fast learning speed. In this paper, we address the problem of extending ELM to non-redundant synergy pattern based graph classification. The widely used graph classification framework consists of two steps: feature generation and classification. The first step is to quickly obtain significant graph pattern features from a graph database. The second is to effectively build a graph classification model with these graph pattern features. We present an efficient depth-first algorithm, called GINS, to find all non-redundant synergy graph patterns. We then construct the graph classification model based on the proposed Support Graph Vector Model (SGVM) and the ELM algorithm. Extensive experiments are conducted on a series of real-life datasets. The results show that GINS is more efficient than two representative competitors. Moreover, when the generated graph patterns are used as classification features, the classification accuracy of GINS+ELM improves substantially.
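
The classification step can be illustrated with a minimal sketch of a basic ELM. It rests on several assumptions that the abstract does not confirm: each graph is taken to be already encoded as a binary vector marking which mined patterns it contains (the actual SGVM encoding is not specified here), the toy data and the names `train_elm` and `predict_elm` are hypothetical, and the output weights are obtained by ridge-regularized least squares instead of the Moore-Penrose pseudoinverse commonly used with ELM.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Build an augmented copy so the caller's data is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden_layer(x, weights, biases):
    """Sigmoid activations of the random hidden layer for one input vector."""
    out = []
    for w_j, b_j in zip(weights, biases):
        z = sum(w * v for w, v in zip(w_j, x)) + b_j
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def train_elm(features, labels, n_hidden=10, ridge=1e-3, seed=0):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    rng = random.Random(seed)
    n_inputs = len(features[0])
    # Hidden-layer parameters are drawn at random and never trained.
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden_layer(x, weights, biases) for x in features]
    # Assumption: solve (H^T H + ridge * I) beta = H^T t instead of
    # computing a pseudoinverse of H.
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    htt = [sum(row[i] * t for row, t in zip(h, labels))
           for i in range(n_hidden)]
    beta = solve_linear(hth, htt)
    return weights, biases, beta


def predict_elm(model, x):
    """Return +1 or -1 from the sign of the ELM output."""
    weights, biases, beta = model
    score = sum(b * a for b, a in zip(beta, hidden_layer(x, weights, biases)))
    return 1 if score >= 0 else -1


# Hypothetical toy data: one row per graph, one column per mined pattern
# (1 = the graph contains the pattern). Labels are +1 / -1.
FEATURES = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
LABELS = [1, 1, 1, -1, -1, -1]

if __name__ == "__main__":
    model = train_elm(FEATURES, LABELS)
    print([predict_elm(model, x) for x in FEATURES])
```

Only the output weights `beta` are fitted; the hidden layer stays at its random initialization, which is what makes ELM training a single linear solve and not an iterative optimization.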
