Particle swarm optimization for network-based data classification

Complex networks provide a powerful tool for data representation because of their ability to describe the interplay between topological, functional, and dynamical properties of the input data. A fundamental step in network-based (graph-based) data analysis is constructing the network from the original data, which is usually in vector form. A natural question arises: how can one construct an "optimal" network for a given processing goal? This paper investigates structural optimization in the context of network-based data classification. Specifically, we propose a particle swarm optimization framework that builds a network from a vector-based data set while optimizing a quality function driven by classification accuracy. The classification process considers both topological and physical features of the training and test data and employs the PageRank measure to classify a test instance according to its importance to each class. Results on artificial and real-world problems show that networks generated by structural optimization generally yield better results than those generated by classical network formation methods. Moreover, this investigation suggests that other network-based machine learning and data mining tasks, such as dimensionality reduction and data clustering, can benefit from the proposed structural optimization method.
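The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the particle encoding, the quality function, or the exact importance measure, so everything below is an assumption made for illustration. Here each particle encodes one neighborhood size k per class, each class network is a distance-weighted kNN graph, a test point is scored by its size-normalized PageRank after being inserted into each class network, and the fitness is the accuracy on a held-out validation split. The names `knn_network`, `pagerank`, `classify`, `accuracy`, and `pso` are hypothetical.

```python
import numpy as np

def knn_network(X, k):
    """Symmetric kNN graph with edge weights exp(-distance); assumed construction."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                    # no self-loops
    nearest = np.argsort(d, axis=1)[:, :k]
    rows = np.repeat(np.arange(len(X)), k)
    cols = nearest.ravel()
    A = np.zeros_like(d)
    A[rows, cols] = np.exp(-d[rows, cols])
    return np.maximum(A, A.T)

def pagerank(A, damping=0.85, iters=100):
    """Power-iteration PageRank on a weighted adjacency matrix."""
    n = len(A)
    strength = A.sum(axis=1)
    strength[strength == 0] = 1.0                  # guard against isolated nodes
    P = A / strength[:, None]                      # row-stochastic transitions
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * (P.T @ r)
    return r

def classify(X_train, y_train, x, ks):
    """Assign x to the class network in which it has the highest importance."""
    classes = np.unique(y_train)
    scores = []
    for c, k in zip(classes, ks):
        Xc = np.vstack([X_train[y_train == c], x])  # test node is the last node
        k = int(np.clip(np.rint(k), 1, len(Xc) - 1))
        r = pagerank(knn_network(Xc, k))
        scores.append(r[-1] * len(Xc))              # normalize by network size
    return classes[int(np.argmax(scores))]

def accuracy(ks, X_train, y_train, X_val, y_val):
    """Quality function: validation accuracy of the networks encoded by ks."""
    preds = np.array([classify(X_train, y_train, x, ks) for x in X_val])
    return float(np.mean(preds == y_val))

def pso(fitness, dim, lo, hi, n_particles=6, iters=10,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO that maximizes `fitness` inside [lo, hi]^dim."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = pbest[np.argmax(pbest_fit)].copy()
    g_fit = float(pbest_fit.max())
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        if pbest_fit.max() > g_fit:
            g_fit = float(pbest_fit.max())
            g = pbest[np.argmax(pbest_fit)].copy()
    return g, g_fit

# Toy data (synthetic, for illustration only): two Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(4, 1, (30, 2))])
y = np.repeat([0, 1], 30)
idx = rng.permutation(60)
tr, va = idx[:40], idx[40:]

best_ks, best_acc = pso(
    lambda p: accuracy(p, X[tr], y[tr], X[va], y[va]), dim=2, lo=1, hi=10)
print("k per class:", np.rint(best_ks).astype(int), "validation accuracy:", best_acc)
```

Encoding only a per-class k keeps the search space tiny. A richer encoding, such as per-node neighborhood sizes or individual edge decisions, would follow the same loop with a larger particle dimension, at a correspondingly higher fitness-evaluation cost.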
