FOR FAST SCALABLE MULTIMEDIA PROCESSING

PLX is a small, fully subword-parallel instruction set architecture (ISA) designed for very fast multimedia processing, especially in constrained environments that require low cost and low power, such as handheld multimedia information appliances. PLX adopts the most useful multimedia instructions previously added to microprocessors. We also introduce a few novel features: a new definition of predication that requires very few bits in each predicated instruction, and datapath scalability from 32-bit to 128-bit words, which allows different degrees of subword parallelism without any changes to the ISA. Performance results from basic multimedia kernels demonstrate PLX’s advantage for multimedia processing.
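To make the idea of subword parallelism and datapath scalability concrete, the following is a minimal software sketch, not PLX's actual instruction semantics. It emulates a modular (wraparound) parallel add over subwords packed into one word, with the word width as a parameter so the same operation covers 2, 4, or 8 lanes of 16 bits. The names `pack`, `unpack`, and `padd` are hypothetical helpers introduced only for this illustration.

```python
from typing import List


def pack(values: List[int], subword_bits: int = 16) -> int:
    """Pack a list of unsigned subwords into one integer, lane 0 in the low bits."""
    mask = (1 << subword_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * subword_bits)
    return word


def unpack(word: int, word_bits: int = 64, subword_bits: int = 16) -> List[int]:
    """Split a packed word back into its unsigned subwords."""
    mask = (1 << subword_bits) - 1
    lanes = word_bits // subword_bits
    return [(word >> (i * subword_bits)) & mask for i in range(lanes)]


def padd(a: int, b: int, word_bits: int = 64, subword_bits: int = 16) -> int:
    """Hypothetical parallel add: each lane is added modulo 2**subword_bits.

    Illustrative only; the word width is a parameter to mimic a datapath
    that scales (for example 32, 64, or 128 bits) under one operation.
    """
    if word_bits % subword_bits != 0:
        raise ValueError("subword size must divide the word size")
    lanes = word_bits // subword_bits
    # high: the top bit of every lane; low: every other bit of the word.
    high = sum(1 << (i * subword_bits + subword_bits - 1) for i in range(lanes))
    low = ((1 << word_bits) - 1) ^ high
    # Adding with the top bits cleared keeps each carry inside its own lane;
    # the top bits are then restored with XOR (a carry-less add of that bit).
    return ((a & low) + (b & low)) ^ ((a ^ b) & high)


# Four 16-bit lanes in a 64-bit word; lane 1 wraps around independently.
x = pack([1, 0xFFFF, 3, 4])
y = pack([1, 1, 1, 1])
result_64 = unpack(padd(x, y, word_bits=64), word_bits=64)

# The same operation on a 128-bit word processes eight lanes at once.
x128 = pack([10, 20, 30, 40, 50, 60, 70, 0xFFFF])
y128 = pack([1] * 8)
result_128 = unpack(padd(x128, y128, word_bits=128), word_bits=128)
```

In this sketch only the `word_bits` argument changes between the 64-bit and 128-bit cases, which mirrors the claim that a wider datapath raises the degree of subword parallelism without changing the operation itself.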
