A unified distributed ELM framework with supervised, semi-supervised and unsupervised big data learning

Extreme learning machine (ELM) and its variants have been widely used in many fields for their good generalization performance and fast learning speed. Although distributed ELM can adequately process large-scale labeled training data, existing distributed ELM methods cannot process partially labeled or unlabeled training data. We therefore propose U-DELM, a unified distributed ELM based on the MapReduce framework that supports supervised, semi-supervised and unsupervised learning. We first compare the computational formulas of the supervised, semi-supervised and unsupervised learning methods and find that the majority of the expensive computations are decomposable. Next, we build U-DELM on the MapReduce framework by extracting three different chained matrix multiplications from these three formulas. We then rewrite each of them as a cumulative sum so that it is suitable for MapReduce. The results of these three computations are then combined to solve for the output weights under the three learning methods. Finally, using benchmark and synthetic datasets, we test and verify the efficiency and effectiveness of U-DELM for learning from massive data. The results show that U-DELM provides unified distributed learning across the supervised, semi-supervised and unsupervised settings.
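To illustrate what "decomposable" can mean here, the following is a minimal sketch rather than the U-DELM implementation. It assumes the standard supervised ELM setting, in which the output weights depend on the hidden-layer output matrix H and the target matrix T only through the products HᵀH and HᵀT, and each product is a sum of per-sample terms. The function names (`map_partial`, `reduce_partials`, `distributed_sums`) and the toy data are hypothetical, and plain Python lists stand in for a real MapReduce job. The additional matrix product needed for the semi-supervised and unsupervised cases is not shown.

```python
from functools import reduce


def map_partial(chunk):
    """Map step: partial sums U = H_k^T H_k and V = H_k^T T_k for one chunk.

    Each element of `chunk` is a pair (h, t): one row of H and one row of T.
    """
    n_hidden = len(chunk[0][0])
    n_out = len(chunk[0][1])
    U = [[0.0] * n_hidden for _ in range(n_hidden)]
    V = [[0.0] * n_out for _ in range(n_hidden)]
    for h, t in chunk:
        for i in range(n_hidden):
            for j in range(n_hidden):
                U[i][j] += h[i] * h[j]
            for j in range(n_out):
                V[i][j] += h[i] * t[j]
    return U, V


def add_matrices(A, B):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def reduce_partials(p, q):
    """Reduce step: add the partial sums produced by two mappers."""
    return add_matrices(p[0], q[0]), add_matrices(p[1], q[1])


def distributed_sums(rows, n_chunks):
    """Split the rows into chunks, map each chunk, then reduce by summation."""
    size = -(-len(rows) // n_chunks)  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    return reduce(reduce_partials, map(map_partial, chunks))


# Toy data (hypothetical): four samples, two hidden nodes, one output.
rows = [
    ([1.0, 2.0], [1.0]),
    ([0.0, 1.0], [0.0]),
    ([3.0, 1.0], [1.0]),
    ([2.0, 2.0], [0.0]),
]

U, V = distributed_sums(rows, n_chunks=2)
```

Because the sums are associative, splitting the toy data into one, two or four chunks yields the same U and V, which is the property that lets the expensive products be computed across many mappers and merged in a reducer. The small matrices U and V can then be used on a single machine to solve for the output weights.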
