Performance is of utmost importance for linear algebra libraries, since they usually form the core of numerical and simulation packages and consume most of the available compute time and resources. However, especially in large-scale simulation frameworks, the readability and ease of use of mathematical expressions are essential for the continuous maintenance, modification, and extension of the software. Based on these requirements, over the last decade C++ Expression Templates (ETs) have gained a reputation as a suitable means of combining an elegant, domain-specific, and intuitive user interface with “HPC-grade” performance. Unfortunately, many of the available ET-based frameworks fall short of the expectation to deliver high performance, which adds to the general mistrust of C++ math libraries. In this paper we present performance results for Smart Expression Template libraries, demonstrating that a proper combination of high-level C++ code and low-level compute kernels can achieve both requirements: an elegant interface and high performance.
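For readers unfamiliar with the technique, the following is a minimal sketch of a classic expression template for vector addition. It is not taken from any of the libraries evaluated in the paper, and it does not show the "smart" variant that dispatches to low-level kernels. All names (`VecExpr`, `Vec`, `Sum`) are hypothetical and chosen for illustration only. The point it illustrates is that `a + b + c` builds a lightweight expression object at compile time, and the element-wise work happens in a single loop inside the assignment, without temporary vectors.

```cpp
#include <cstddef>
#include <vector>

// CRTP base class: tags a type as a vector expression (hypothetical name).
template <typename E>
struct VecExpr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Expression node representing the element-wise sum of two expressions.
// It stores references only, so it must not outlive its operands
// (for example, do not keep it in an `auto` variable past the statement).
template <typename L, typename R>
class Sum : public VecExpr<Sum<L, R>> {
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) {}
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }

private:
    const L& l_;
    const R& r_;
};

// Hypothetical dense vector type used for illustration.
class Vec : public VecExpr<Vec> {
public:
    explicit Vec(std::size_t n, double value = 0.0) : data_(n, value) {}

    // Assigning an expression evaluates it in one fused loop.
    template <typename E>
    Vec& operator=(const VecExpr<E>& expr) {
        const E& e = expr.self();
        data_.resize(e.size());
        for (std::size_t i = 0; i < data_.size(); ++i) {
            data_[i] = e[i];
        }
        return *this;
    }

    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// operator+ performs no arithmetic; it only builds the expression node.
template <typename L, typename R>
Sum<L, R> operator+(const VecExpr<L>& l, const VecExpr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

// Usage: the sum of three vectors is computed in a single pass.
inline Vec add_three(const Vec& a, const Vec& b, const Vec& c) {
    Vec d(a.size());
    d = a + b + c;  // type of the right-hand side: Sum<Sum<Vec, Vec>, Vec>
    return d;
}
```

This sketch assumes all operands have the same length and performs no size checking. A production library would add such checks and, as the abstract suggests, might forward suitable subexpressions to optimized compute kernels instead of evaluating everything element by element.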