The Parallelization of Back Propagation Neural Network in MapReduce and Spark

Artificial neural networks have proved effective for recognition, regression and classification tasks. A number of neural network models have been developed, for example the Hamming network, the Grossberg network and the Hopfield network. Among these, the back propagation neural network (BPNN) has become the most popular owing to its strong function approximation and generalization abilities. However, BPNN is both data intensive and computationally intensive, so its efficiency degrades significantly on big data workloads. This paper therefore presents a parallel BPNN algorithm based on data separation, implemented in three distributed computing environments: Hadoop, HaLoop and Spark. To improve the accuracy of the algorithm, ensemble techniques are also employed. The algorithm is first evaluated on a small-scale cluster and then in a commercial cloud computing environment. The experimental results indicate that the proposed algorithm improves the efficiency of BPNN while maintaining its accuracy.
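The idea of data separation combined with an ensemble can be illustrated with a minimal single-machine sketch. It is not the paper's implementation: the abstract does not say how the data are split or how the sub-networks are combined, so bootstrap resampling and majority voting are assumptions here, and every name (`train_bpnn`, `train_ensemble`, `majority_vote`, the toy data set) is hypothetical. Each call to `train_bpnn` stands in for the work one Hadoop mapper or Spark task might do on its own data subset.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def forward(net, x):
    # Each weight vector stores its bias as the last element.
    hidden, output = net
    h = [sigmoid(sum(w * v for w, v in zip(ws, x)) + ws[-1]) for ws in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h)) + output[-1])
    return h, y


def train_bpnn(samples, n_hidden=4, epochs=200, lr=0.5, seed=0):
    """Train a one-hidden-layer BPNN with plain stochastic gradient descent."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    net = (hidden, output)
    for _ in range(epochs):
        for x, target in samples:
            h, y = forward(net, x)
            # Output-layer error term for squared error with a sigmoid unit.
            delta_out = (y - target) * y * (1.0 - y)
            # Back-propagate to the hidden layer before changing output weights.
            for j, ws in enumerate(hidden):
                delta_h = delta_out * output[j] * h[j] * (1.0 - h[j])
                for i, xi in enumerate(x):
                    ws[i] -= lr * delta_h * xi
                ws[-1] -= lr * delta_h
            for j, hj in enumerate(h):
                output[j] -= lr * delta_out * hj
            output[-1] -= lr * delta_out
    return net


def predict(net, x):
    return 1 if forward(net, x)[1] >= 0.5 else 0


def bootstrap(samples, rng):
    # Assumed data-separation scheme: sample with replacement.
    return [rng.choice(samples) for _ in samples]


def majority_vote(votes):
    return 1 if sum(votes) * 2 > len(votes) else 0


def train_ensemble(samples, n_models=3, seed=42):
    rng = random.Random(seed)
    subsets = [bootstrap(samples, rng) for _ in range(n_models)]
    # "Map" step: the subsets are independent, so on a cluster each one
    # could be trained by a separate worker.
    return [train_bpnn(s, seed=k) for k, s in enumerate(subsets)]


def ensemble_predict(models, x):
    # "Reduce" step: combine the sub-network outputs by majority vote.
    return majority_vote([predict(m, x) for m in models])


def make_data(n, rng):
    # Toy two-class problem: label 1 when x0 + x1 > 1.
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        data.append((x, 1 if x[0] + x[1] > 1.0 else 0))
    return data


data_rng = random.Random(7)
train_set = make_data(150, data_rng)
test_set = make_data(100, data_rng)
models = train_ensemble(train_set)
correct = sum(ensemble_predict(models, x) == y for x, y in test_set)
accuracy = correct / len(test_set)
print("ensemble accuracy:", accuracy)
```

Because the sub-networks share no state during training, the map step parallelizes without communication. Only the trained weights need to be collected for the voting step, which is one plausible reason a data-separation design suits MapReduce-style frameworks.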
