2.44-GFLOPS 300-MHz floating-point vector-processing unit for high-performance 3D graphics computing

A vector unit for high-performance three-dimensional graphics computing has been developed. It implements four floating-point multiply-accumulate units, each of which executes multiply-add operations with a throughput of one operation per cycle; one floating-point divide/square-root unit, which executes division and square-root operations in six cycles at 300 MHz; and one vector general-purpose register file organized as 128 bits × 32 words. With all units executing in parallel, the design delivers a peak performance of 2.44 GFLOPS at 300 MHz.
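The peak figure can be roughly checked from the unit counts in the abstract. The sketch below is not the authors' accounting: it assumes each multiply-add counts as two floating-point operations and that the divide/square-root unit retires one operation every `div_interval_cycles` cycles. The function and parameter names are hypothetical. Under these assumptions the four multiply-accumulate units contribute 2.40 GFLOPS, and the divide unit supplies the small remainder; a six-cycle interval lands slightly above the quoted 2.44 GFLOPS, so the exact issue interval used for the published figure is not recoverable from the abstract alone.

```python
def peak_mflops(num_fmac, clock_mhz, div_interval_cycles):
    """Back-of-envelope peak throughput in MFLOPS.

    Assumptions (not stated in the abstract):
      - one multiply-add counts as 2 floating-point operations
      - the divide/square-root unit completes 1 operation
        every `div_interval_cycles` cycles
    """
    fmac_mflops = num_fmac * 2 * clock_mhz        # multiply-accumulate units
    div_mflops = clock_mhz / div_interval_cycles  # divide/square-root unit
    return fmac_mflops, div_mflops, fmac_mflops + div_mflops


# Figures taken from the abstract: 4 FMAC units, 300 MHz, six-cycle divide.
fmac, div, total = peak_mflops(num_fmac=4, clock_mhz=300, div_interval_cycles=6)
print(f"FMAC units: {fmac / 1000:.2f} GFLOPS")
print(f"Divide unit: {div / 1000:.3f} GFLOPS")
print(f"Total: {total / 1000:.2f} GFLOPS")

# Register file capacity: 128 bits x 32 words.
regfile_bits = 128 * 32
print(f"Register file: {regfile_bits} bits ({regfile_bits // 8} bytes)")
```

Changing `div_interval_cycles` shows how sensitive the last digit of the peak figure is to the assumed divide throughput, while the multiply-accumulate contribution stays fixed.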
