Single instruction multiple data code auto generation for a very long instruction words digital signal processor in sensor-based systems

Emerging applications impose demanding requirements on sensor-based systems, such as high processing capacity, low power consumption and reduced size. Owing to their balanced combination of flexibility and hardware performance, digital signal processors (DSPs) are increasingly used in sensor-based systems. Many DSPs have adopted a very long instruction word (VLIW) architecture for its ability to greatly enhance instruction-level parallelism. However, because VLIW code is statically scheduled, the performance of a VLIW architecture is dominated by the efficiency of its compiler. Single instruction multiple data (SIMD) instructions, which perform multiple operations in parallel on multiple data elements packed in registers, have been widely used in DSPs to meet the requirements of sensor-based systems. Although hand programming still yields the best-performing SIMD code, it is both time consuming and error prone. Advanced compiler techniques that automatically generate SIMD instructions are therefore urgently needed. In this study, the authors propose an SIMD code auto-generation approach for VLIW architectures. It recognises candidate operations in the intermediate representation, evaluates whether they can be grouped into SIMD code, reconstructs the verified groups according to a cost model and finally generates the SIMD code. The authors have implemented this approach in the compiler of a VLIW DSP named Magnolia, which is designed for sensor-based systems. The results show that the approach is efficient and substantially improves performance.
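The four steps named above (recognise candidates, evaluate grouping, apply a cost model, generate SIMD code) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Op` and `SimdOp` types, the greedy fixed-width window, and the unit-cost model with a `pack_overhead` parameter are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Loc = Tuple[str, int]  # (array name, element index); hypothetical IR operand

@dataclass(frozen=True)
class Op:
    """A scalar three-address operation: dst = src1 <opcode> src2."""
    opcode: str
    dst: Loc
    src1: Loc
    src2: Loc

@dataclass(frozen=True)
class SimdOp:
    """A packed operation over `width` consecutive elements from each base."""
    opcode: str
    width: int
    dst: Loc
    src1: Loc
    src2: Loc

def _consecutive(locs: List[Loc]) -> bool:
    # True if the locations are elements k, k+1, ... of one array.
    name, base = locs[0]
    return all(loc == (name, base + k) for k, loc in enumerate(locs))

def can_group(group: List[Op]) -> bool:
    # Candidates must be isomorphic: same opcode in every lane.
    if len({op.opcode for op in group}) != 1:
        return False
    # Each operand position must map to consecutive memory elements.
    for field in ("dst", "src1", "src2"):
        if not _consecutive([getattr(op, field) for op in group]):
            return False
    # No lane may read a value written by an earlier lane (read-after-write).
    written = set()
    for op in group:
        if op.src1 in written or op.src2 in written:
            return False
        written.add(op.dst)
    return True

def profitable(width: int, pack_overhead: int = 0) -> bool:
    # Assumed cost model: each scalar op costs 1, one SIMD op costs 1
    # plus any packing/unpacking overhead.
    return 1 + pack_overhead < width

def vectorize(ops: List[Op], width: int = 4,
              pack_overhead: int = 0) -> List[Union[Op, SimdOp]]:
    out: List[Union[Op, SimdOp]] = []
    i = 0
    while i < len(ops):
        group = ops[i:i + width]
        if (len(group) == width and can_group(group)
                and profitable(width, pack_overhead)):
            first = group[0]
            out.append(SimdOp(first.opcode, width,
                              first.dst, first.src1, first.src2))
            i += width
        else:
            out.append(ops[i])  # leave the operation scalar
            i += 1
    return out
```

For example, four additions `c[k] = a[k] + b[k]` for `k = 0..3` collapse into a single `SimdOp`, whereas a chain such as `a[k+1] = a[k] + b[k]` stays scalar because each lane depends on the previous one. A real compiler would also handle alignment, mixed data widths and instruction scheduling, which this sketch ignores.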
