A Low-Cost Energy-Efficient Raspberry Pi Cluster for Data Mining Algorithms

Data mining algorithms are essential tools for extracting information from the growing number of large datasets, often called Big Data. However, these algorithms demand large amounts of computing power to produce reliable results. Although conventional High Performance Computing (HPC) platforms can deliver such performance, they are typically expensive and power-hungry. This paper presents a study of an unconventional, low-cost, energy-efficient HPC cluster composed of Raspberry Pi nodes. The performance, power, and energy efficiency of this platform are compared with those of a well-known HPC coprocessor (Intel Xeon Phi) for two data mining algorithms: Apriori and K-Means. The experimental results show that the Raspberry Pi cluster can consume up to \(88.35\%\) and \(85.17\%\) less power than the Intel Xeon Phi when running Apriori and K-Means, respectively, and up to \(45.51\%\) less energy when running Apriori.
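The gap between the power saving and the energy saving for Apriori follows from the relation energy = average power × runtime: a platform that draws far less power but runs longer saves proportionally less energy. The sketch below illustrates this arithmetic. It assumes, hypothetically, that the two "up to" Apriori figures come from the same run and that power is an average over the run; the abstract does not state this, so the resulting runtime ratio is only illustrative. The function names are invented for this example.

```python
def remaining_fraction(percent_less: float) -> float:
    """Fraction of the baseline that remains after a 'percent less' reduction."""
    return 1.0 - percent_less / 100.0


def implied_runtime_ratio(power_less_pct: float, energy_less_pct: float) -> float:
    """Runtime ratio (cluster / baseline) implied by E = P * t.

    Assumption: both percentages describe the same run, with P taken as
    the average power over that run. This is not confirmed by the abstract.
    """
    power_ratio = remaining_fraction(power_less_pct)    # P_cluster / P_baseline
    energy_ratio = remaining_fraction(energy_less_pct)  # E_cluster / E_baseline
    # E = P * t  =>  t_cluster / t_baseline = energy_ratio / power_ratio
    return energy_ratio / power_ratio


if __name__ == "__main__":
    # Apriori figures quoted in the abstract: 88.35% less power, 45.51% less energy.
    ratio = implied_runtime_ratio(88.35, 45.51)
    print(f"implied runtime ratio (cluster / Xeon Phi): {ratio:.2f}")
```

Under these assumptions the cluster would take several times longer than the Xeon Phi on Apriori while still using less energy overall, which is consistent with the abstract reporting a smaller saving in energy than in power.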
