Hardware Support for Arithmetic Units of Processor with Multimedia Extension

Multimedia extensions are widely used in processor design to improve multimedia processing performance. This paper studies the hardware implementation of arithmetic units that support multimedia extensions, and proposes new design methods for a subword multiplier and a SIMD (single instruction, multiple data) IEEE floating-point unit (FPU). To verify the correctness and effectiveness of these methods, a multimedia coprocessor with SIMD fixed-point and floating-point units was designed. The implemented chip demonstrates that the proposed SIMD arithmetic units achieve a good tradeoff between cost and performance.
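
For readers unfamiliar with the term, the following is a minimal software sketch of what a subword multiply computes. It is not the hardware design proposed in the paper. The 32-bit word width, the unsigned lane interpretation, and the names `split_lanes` and `subword_multiply` are assumptions chosen for illustration only.

```python
def split_lanes(word, lane_bits, word_bits=32):
    """Split an unsigned word into lanes, least significant lane first."""
    mask = (1 << lane_bits) - 1
    return [(word >> shift) & mask for shift in range(0, word_bits, lane_bits)]


def subword_multiply(a, b, lane_bits, word_bits=32):
    """Multiply corresponding unsigned lanes of a and b independently.

    Illustrative model only: each product is returned at full precision
    (2 * lane_bits wide), and no carries cross lane boundaries.
    """
    if word_bits % lane_bits != 0:
        raise ValueError("lane_bits must divide word_bits")
    lanes_a = split_lanes(a, lane_bits, word_bits)
    lanes_b = split_lanes(b, lane_bits, word_bits)
    return [x * y for x, y in zip(lanes_a, lanes_b)]


# The same operands interpreted as two 16-bit lanes or as one 32-bit lane.
two_lanes = subword_multiply(0x00030002, 0x00050004, lane_bits=16)
one_lane = subword_multiply(0x00030002, 0x00050004, lane_bits=32)
```

The point of the sketch is that one pair of operand words yields different results depending on the selected lane width, which is the behavior a subword multiplier has to provide with shared hardware.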
