ZAKI: A Smart Method and Tool for Automatic Performance Optimization of Parallel SpMV Computations on Distributed Memory Machines

Sparse matrix-vector multiplication (SpMV) is a vital computing operation in many scientific, engineering, economic and social applications, and is increasingly used to develop timely intelligence for the design and management of smart societies. Several factors affect the performance of SpMV computations, including matrix characteristics, storage formats, and software and hardware platforms. The complexity of computer systems is rising with the increasing number of cores per processor, levels of cache, processors per node, and high-speed interconnects. There is an ever-growing need for new optimization techniques and efficient ways of exploiting parallelism. In this paper, we propose ZAKI, a data-driven, machine-learning approach and tool that predicts the optimal number of processes for SpMV computations on an arbitrary sparse matrix on a distributed memory machine. The aim is to allow application scientists to automatically obtain the best configuration, and hence the best performance, for the execution of SpMV computations. We train and test the tool using nearly 2000 real-world matrices obtained from 45 application domains, including computational fluid dynamics (CFD), computer vision, and robotics. The tool uses three machine learning methods (decision trees, random forest, and gradient boosting) and is evaluated in depth. We also discuss the applicability of the proposed tool to energy efficiency optimization of SpMV computations. This is the first work in which the sparsity structure of matrices has been exploited to predict the optimal number of processes for a given matrix in distributed memory environments using different base and ensemble machine learning methods.
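To make the terms above concrete, the following is a minimal sketch, not the ZAKI implementation. It assumes the matrix is stored in compressed sparse row (CSR) format, shows a serial SpMV kernel, and computes a few simple sparsity-structure statistics of the kind that could be fed to a learned predictor. The function names and the chosen features are hypothetical illustrations; the abstract does not list the features ZAKI actually uses.

```python
from statistics import mean, pstdev


def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        # Non-zeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def sparsity_features(row_ptr, n_cols):
    """Hypothetical structural features (assumed, not taken from the paper)."""
    n_rows = len(row_ptr) - 1
    row_nnz = [row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)]
    nnz = row_ptr[-1]
    return {
        "n_rows": n_rows,
        "n_cols": n_cols,
        "nnz": nnz,
        "density": nnz / (n_rows * n_cols),
        "mean_row_nnz": mean(row_nnz),
        "std_row_nnz": pstdev(row_nnz),
        "max_row_nnz": max(row_nnz),
    }


# Example matrix: [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
features = sparsity_features(row_ptr, n_cols=3)
```

In a setup like the one the abstract describes, a feature vector of this kind, paired with the measured best process count for each training matrix, would be the input to a decision tree, random forest, or gradient boosting model.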
