Pack Instruction Generation for Media Processors Using Multi-valued Decision Diagram

SIMD instructions are often implemented in modern multimedia oriented processors. Although SIMD instructions are useful for many digital signal processing applications, most compilers do not exploit SIMD instructions. The difficulty in the utilization of SIMD instructions stems from data parallelism in registers. In assembly code generation, the positions of data in registers must be noted. A technique of generating pack instructions which pack or reorder data in registers is essential for exploitation of SIMD instructions. This paper presents a code generation technique for SIMD instructions with pack instructions. SIMD instructions are generated by finding and grouping the same operations in programs. After the SIMD instruction generation, pack instructions are generated. In the pack instruction generation, Multi-valued Decision Diagram (MDD) is introduced to represent and to manipulate sets of packed data. Experimental results show that our code generation technique can generate assembly code with SIMD and pack instructions performing complex repacking of 8 packed data in registers for a commercial VLIW processor with 6 pack instructions and achieved speedup ratio of up to 7.7.

[1]  Peter Kogge,et al.  Generation of permutations for SIMD processors , 2005, LCTES '05.

[2]  Dr. Rainer Leupers Code Optimization Techniques for Embedded Processors , 2000, Springer US.

[3]  Ken-ichi Maeda,et al.  Development of Image Recognition Processor Based on Configurable Processor , 2005, J. Robotics Mechatronics.

[4]  Saman P. Amarasinghe,et al.  Exploiting superword level parallelism with multimedia instruction sets , 2000, PLDI '00.

[5]  Peng Wu,et al.  Vectorization for SIMD architectures with alignment constraints , 2004, PLDI '04.

[6]  Alfred V. Aho,et al.  Code generation using tree matching and dynamic programming , 1989, ACM Trans. Program. Lang. Syst..

[7]  Aart J. C. Bik,et al.  Automatic Intra-Register Vectorization for the Intel® Architecture , 2002, International Journal of Parallel Programming.

[8]  Robert K. Brayton,et al.  Algorithms for discrete function manipulation , 1990, 1990 IEEE International Conference on Computer-Aided Design. Digest of Technical Papers.