Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms

SPIRAL is a generator for libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized when the hardware is upgraded or replaced. This paper describes the main components of SPIRAL: the mathematical framework that concisely describes signal transforms and their fast algorithms; the formula generator that captures, at the algorithmic level, the degrees of freedom in expressing a particular signal processing transform; the formula translator that encapsulates the compilation degrees of freedom when translating a specific algorithm into an actual code implementation; and, finally, an intelligent search engine that finds, within the large space of alternative formulas and implementations, the “best” match to the given computing platform. We present empirical data that demonstrate the high performance of SPIRAL-generated code.
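To make the idea of "degrees of freedom at the algorithmic level" concrete, the toy sketch below enumerates the recursive Cooley-Tukey factorizations of a small DFT and times each one on the current machine. It is an illustration only, not SPIRAL's formula generator, translator, or search engine: the names `dft_naive`, `dft_ct`, `plans`, and `best_plan` are hypothetical, and exhaustive timing stands in for the search strategies the paper describes.

```python
import cmath
import time


def dft_naive(x):
    """Direct O(n^2) DFT; serves as the base case and as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dft_ct(x, plan):
    """DFT computed according to a plan (hypothetical representation).

    plan is None (use the naive DFT) or a tuple (m, plan_m, plan_k),
    meaning: split n = m * k by Cooley-Tukey, then apply plan_m to the
    size-m sub-transforms and plan_k to the size-k sub-transforms.
    """
    n = len(x)
    if plan is None:
        return dft_naive(x)
    m, plan_m, plan_k = plan
    k = n // m
    # k strided sub-sequences, each of length m.
    inner = [dft_ct(x[j2::k], plan_m) for j2 in range(k)]
    w = cmath.exp(-2j * cmath.pi / n)
    out = [0j] * n
    for q1 in range(m):
        # Apply twiddle factors, then a size-k transform across sub-results.
        col = [inner[j2][q1] * w ** (j2 * q1) for j2 in range(k)]
        res = dft_ct(col, plan_k)
        for q2 in range(k):
            out[q1 + m * q2] = res[q2]
    return out


def plans(n):
    """Enumerate every recursive factorization plan for size n."""
    yield None
    for m in range(2, n):
        if n % m == 0:
            for pm in plans(m):
                for pk in plans(n // m):
                    yield (m, pm, pk)


def best_plan(n, trials=3):
    """Pick the plan with the lowest measured runtime on this machine."""
    x = [complex(i, 0) for i in range(n)]

    def cost(p):
        t0 = time.perf_counter()
        for _ in range(trials):
            dft_ct(x, p)
        return time.perf_counter() - t0

    return min(plans(n), key=cost)


if __name__ == "__main__":
    print(len(list(plans(16))), "candidate plans for n = 16")
    print("fastest here:", best_plan(16))
```

All plans compute the same transform; they differ only in how the work is organized, so the plan that `best_plan` returns can vary from machine to machine and from run to run. The number of candidates grows quickly with the transform size, which is why exhaustive timing is only practical for toy sizes like these.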
