Efficient Distribution and Processing of Data for Parallelizing Data Mining in Mobile Clouds

We study different kinds of data distribution in order to make parallelized implementations of data mining in mobile cloud systems more efficient. Our formally based approach ensures the correctness of the resulting parallel implementation. We apply it to data mining algorithms in systems where a cloud is accessed via a mobile (wireless) network. The approach derives a parallel implementation of a data mining algorithm that performs as many computations as possible at the local servers of the mobile network, rather than transferring the data to a high-performance cluster in the cloud for processing, as is done in current cloud systems based on MapReduce. We implement our approach by extending the Java-based library DXelopes, and we illustrate our results with the popular Normal Bayes classifier training algorithm. Our experiments on real-world data sets confirm that our approach significantly reduces both network traffic and application run time.
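To illustrate why computing at local servers can reduce traffic, the following minimal sketch trains a count-based Bayes classifier on horizontally partitioned data. It is an illustration under stated assumptions, not the implementation described above and not the DXelopes API: the function names `local_counts` and `merge_counts`, the categorical features, and the toy data are all hypothetical. Each server summarizes its own rows as counts, and only these small count tables would be sent over the network and added up.

```python
from collections import Counter


def local_counts(rows):
    """Run at one local server: summarize its (features, label) rows as counts."""
    class_counts = Counter()
    feature_counts = Counter()
    for features, label in rows:
        class_counts[label] += 1
        for index, value in enumerate(features):
            # Key: (class label, feature position, feature value)
            feature_counts[(label, index, value)] += 1
    return class_counts, feature_counts


def merge_counts(partials):
    """Run at the merging node: add up the count tables sent by the servers."""
    total_classes, total_features = Counter(), Counter()
    for class_counts, feature_counts in partials:
        total_classes.update(class_counts)      # Counter.update adds counts
        total_features.update(feature_counts)
    return total_classes, total_features


# Hypothetical horizontal partition of one data set across two local servers.
server_a = [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")]
server_b = [(("sunny", "mild"), "yes"), (("rainy", "hot"), "no")]

# Distributed training: local summaries first, then a cheap merge.
merged = merge_counts([local_counts(server_a), local_counts(server_b)])

# Centralized training on all raw rows, for comparison.
centralized = local_counts(server_a + server_b)
```

Because the counts are additive, `merged` equals `centralized` in this sketch, so the raw rows never need to leave the servers that hold them. Whether a given algorithm decomposes this way depends on the algorithm and on how the data are distributed.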
