Superword level parallelism aware word length optimization

Many embedded processors do not support floating-point arithmetic, in order to meet strict cost and power consumption constraints. However, they generally support SIMD as a means of improving performance at little cost overhead. Achieving good performance on such processors requires the use of fixed-point arithmetic and efficient exploitation of the SIMD data-path. To reduce time-to-market, methodologies have been proposed for automatic SIMDization, such as superword level parallelism (SLP) extraction, and for float-to-fixed-point conversion. In this paper we show that applying these transformations independently is not efficient. We propose an SLP-aware word length optimization algorithm that jointly performs float-to-fixed-point conversion and SLP extraction. We implement the proposed approach in a source-to-source compiler framework and evaluate it on several embedded processors. Experimental results illustrate the validity of our approach.
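The interaction between word length and SIMD packing can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper. It only shows, under assumed parameters, why the two choices are coupled: a shorter word length fits more operands into one SIMD register but increases quantization error. The 32-bit register width, the sample values, and all function names are hypothetical.

```python
# Illustrative only: the SIMD width, sample data, and helper names are assumptions.

def to_fixed(x, frac_bits, word_length):
    """Quantize x to a signed fixed-point integer, saturating at the word limits."""
    lo = -(1 << (word_length - 1))
    hi = (1 << (word_length - 1)) - 1
    q = round(x * (1 << frac_bits))
    return max(lo, min(hi, q))

def to_float(q, frac_bits):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)

def lanes(simd_width, word_length):
    """Number of operands of the given word length that fit in one SIMD register."""
    return simd_width // word_length

def max_error(samples, word_length):
    """Worst-case quantization error, using all bits but the sign as fraction bits."""
    frac_bits = word_length - 1
    return max(
        abs(x - to_float(to_fixed(x, frac_bits, word_length), frac_bits))
        for x in samples
    )

SIMD_WIDTH = 32                      # hypothetical register width in bits
SAMPLES = [0.3, -0.71, 0.125, 0.9]   # hypothetical signal values in [-1, 1)

if __name__ == "__main__":
    for wl in (8, 16):
        print(wl, "bits:", lanes(SIMD_WIDTH, wl), "lanes, max error",
              max_error(SAMPLES, wl))
```

A word length optimizer that ignores SLP may pick lengths that meet the accuracy constraint but prevent operations from being grouped into the same SIMD instruction, which is the motivation for treating the two problems jointly.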
