An OS-ELM based distributed ensemble classification framework in P2P networks

Abstract: Although classification in centralized environments has been widely studied in recent years, classification in P2P networks remains an important research problem because of the popularity of P2P computing environments. The main goal of classification in P2P networks is to decrease prediction error efficiently while keeping network overhead small. In this paper, we propose an OS-ELM based ensemble classification framework for distributed classification in a hierarchical P2P network. In the framework, we apply the incremental learning principle of OS-ELM to the hierarchical P2P network to generate an ensemble classifier. The ensemble classifier can be implemented in the P2P network in two ways: one-by-one ensemble classification and parallel ensemble classification. Furthermore, we propose a data space coverage based peer selection approach to reduce the high communication cost and large delay. We also design a two-layer index structure to support peer selection efficiently. Each peer creates a local Quad-tree to index its local data, and a super-peer creates a global Quad-tree to summarize the local indexes of its peers. Extensive experimental studies verify the efficiency and effectiveness of the proposed algorithms.
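The abstract does not spell out how data space coverage drives peer selection, so the following is only a minimal sketch of one plausible reading, not the paper's algorithm. It assumes 2-D data in the unit square, summarizes each peer by the set of fixed-depth quad-tree cells its data occupy, and lets a super-peer pick peers with a greedy set-cover heuristic. The names `quad_cell`, `local_summary`, and `select_peers`, the fixed depth, and the greedy rule are all hypothetical.

```python
# Illustrative sketch only: the abstract does not specify the selection
# algorithm, so greedy set cover over quad-tree cells is an assumption.

def quad_cell(x, y, depth):
    """Return the quad-tree cell id of point (x, y) in the unit square."""
    cell = []
    x0, y0, size = 0.0, 0.0, 1.0
    for _ in range(depth):
        size /= 2.0
        qx = 1 if x >= x0 + size else 0   # right half of the current cell?
        qy = 1 if y >= y0 + size else 0   # upper half of the current cell?
        cell.append(2 * qy + qx)          # quadrant number 0..3 at this level
        x0 += qx * size
        y0 += qy * size
    return tuple(cell)


def local_summary(points, depth=2):
    """Peer side: the set of quad-tree cells occupied by the local data."""
    return {quad_cell(x, y, depth) for x, y in points}


def select_peers(summaries):
    """Super-peer side: greedily pick peers until all occupied cells are covered."""
    uncovered = set().union(*summaries.values()) if summaries else set()
    chosen = []
    while uncovered:
        # Pick the peer covering the most still-uncovered cells
        # (ties are broken by peer name for determinism).
        best = max(sorted(summaries), key=lambda p: len(summaries[p] & uncovered))
        chosen.append(best)
        uncovered -= summaries[best]
    return chosen


# Hypothetical data: three peers with a handful of 2-D points each.
peers = {
    "A": [(0.1, 0.1), (0.9, 0.9)],
    "B": [(0.1, 0.15)],
    "C": [(0.9, 0.1)],
}
summaries = {name: local_summary(pts, depth=1) for name, pts in peers.items()}
selected = select_peers(summaries)
print(selected)  # ['A', 'C']
```

In this toy run, peer B is skipped because its only cell is already covered by A, which illustrates how contacting fewer peers could reduce communication cost and delay; the local classifiers of the selected peers would then be combined into the ensemble.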
