Comparison of Analytical and Empirical Performance Models: A Case Study on Multigrid Systems

The runtime complexity of software can be described using Big O notation. However, this is only a theoretical indicator and does not predict the actual runtime of a program. To approximate the runtime of a given program, so-called performance models are used, which can be categorized into analytical and empirical performance models. While analytical performance models are created using domain knowledge, empirical performance models are obtained using machine-learning strategies. In this thesis, we compare analytical and empirical performance models for two multigrid systems, SMG2000 and BoomerAMG. The empirical performance models are generated by the tool SPL Conqueror, using different sampling heuristics to define the learning set. To enable a comparison between these two kinds of performance models, we propose two comparison strategies, which we evaluate and discuss. For one of these strategies, we investigate different distance and similarity measures. We observe that the choice of sampling heuristic influences the outcome of both comparison strategies. Furthermore, we notice that the two comparison strategies, as well as the results of the individual distance and similarity measures, differ from each other.
