The development of a data-driven application benchmarking approach to performance modelling

Performance modelling is a useful tool in the lifecycle of high performance scientific software, such as weather and climate models, especially as a means of ensuring efficient use of available computing resources. In particular, sufficiently accurate performance prediction could reduce the effort and experimental computer time required when porting and optimising a climate model to a new machine. Yet as architectures become more complex, performance prediction is becoming more difficult. Traditional methods of performance prediction, based on source code analysis and supported by machine benchmarks, are proving inadequate to the task. In this paper, the reasons for this are explored by applying some traditional techniques to predict the computation time of a simple shallow water model which is illustrative of the computation (and communication) involved in climate models. The resulting models are compared with real execution data gathered on AMD Opteron-based systems, including several phases of the U.K. academic community HPC resource, HECToR. Some success is achieved in relating source code to achieved performance for the K10 series of Opterons, but the method is found to be inadequate for the next-generation Interlagos processor. This experience motivates the investigation of a data-driven application benchmarking approach to performance modelling. Results for an early version of the approach are presented using the shallow water model as an example. In addition, the data-driven approach is compared with a novel analytical model based on fitting logarithmic curves to benchmarked application data. The limitations of this analytical method provide further motivation for the development of the data-driven approach, and results of this work have been published elsewhere.
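
As an illustration of the kind of analytical fit mentioned above, the sketch below fits a logarithmic curve of the form t(n) = a + b·log(n) to benchmarked execution times using NumPy. The functional form, variable names, and data values are assumptions chosen for illustration; they are not the paper's actual model or measurements.

```python
import numpy as np

# Hypothetical benchmark data: problem sizes and measured compute times (seconds).
# These values are illustrative only, not taken from the paper.
problem_sizes = np.array([64, 128, 256, 512, 1024, 2048])
measured_times = np.array([1.02, 1.36, 1.71, 2.04, 2.41, 2.74])

# Fit t(n) = a + b * log(n) by linear least squares in log(n).
log_n = np.log(problem_sizes)
b, a = np.polyfit(log_n, measured_times, 1)  # degree-1 fit returns [slope, intercept]

def predict_time(n):
    """Predict compute time for a problem size n using the fitted curve."""
    return a + b * np.log(n)

# Example: extrapolate to a problem size larger than any benchmarked run,
# which is where such analytical fits tend to break down.
print(f"fit: t(n) = {a:.3f} + {b:.3f} * log(n)")
print(f"predicted time at n=4096: {predict_time(4096):.3f} s")
```

A data-driven benchmarking approach would instead interpolate directly from a database of measured application runs rather than relying on an assumed functional form such as the logarithm above.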
