A Domain-Specific Language for the Generation of Optimized SIMD-Parallel Assembly Code

We present a domain-specific language embedded into Haskell that allows mathematicians to formulate novel high-performance SIMD-parallel algorithms for the evaluation of special functions. Developing such functions involves exploring both the mathematical properties of the functions, which lead to effective (rational) polynomial approximations, and the specific properties of the binary representation of floating-point numbers. Our framework includes support for estimating the effectiveness of different approximation schemes in Maple. Once a scheme is chosen, the Maple-generated component is integrated into the code generation setup. Numerical experimentation can then be performed interactively, with support functions for running standard tests and tabulating results. Once a satisfactory formulation is achieved, a codegraph representation of the algorithm can be passed to other components which produce C function bodies, or to a state-of-the-art scheduler which produces optimal or near-optimal schedules, currently targeting the “Cell Broadband Engine” processor. Encapsulating a considerable amount of knowledge about specific “tricks” in DSL constructs allows us to produce algorithm specifications that are precise, readable, and compile to optimal-quality assembly code, while formulations of the equivalent algorithms in C would be almost impossible to understand and maintain.

General Terms: Domain-specific languages, Synthesis from specifications, Industrial applications
