Low-Power Leading-Zero Counting and Anticipation Logic for High-Speed Floating Point Units

This paper presents a new leading-zero counter (or detector). New Boolean relations are derived for the bits of the leading-zero count, allowing them to be computed using standard carry-lookahead techniques. The proposed approach allows various design choices to be explored and different circuit topologies to be derived for the leading-zero counting unit. The new circuits can be implemented efficiently in either static or dynamic logic and require significantly less energy per operation than previously known architectures. The integration of the proposed leading-zero counter with the leading-zero anticipation logic is analyzed, and the most efficient combination is identified. Finally, a simple yet efficient technique for handling the error of the leading-zero anticipation logic is presented. The energy-delay behavior of the proposed circuits is investigated using static and dynamic implementations in a 130-nm CMOS technology.
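
For readers unfamiliar with the function being implemented, the following is a minimal behavioral sketch of what a leading-zero counter computes. It is not the set of Boolean relations derived in the paper, which the abstract does not state; it uses a generic divide-and-conquer scheme, and the function names `lzc_reference` and `lzc_tree` are hypothetical names chosen for illustration.

```python
def lzc_reference(x: int, width: int) -> int:
    """Bit-serial reference: scan from the MSB until the first 1 is found."""
    count = 0
    for i in range(width - 1, -1, -1):
        if (x >> i) & 1:
            break
        count += 1
    return count


def lzc_tree(x: int, width: int) -> int:
    """Divide-and-conquer leading-zero count (illustrative, not the paper's method).

    Assumes width is a power of two and 0 <= x < 2**width.
    """
    if width == 1:
        return 1 - (x & 1)          # a single bit contributes 1 zero if it is 0
    half = width // 2
    hi = x >> half                  # upper half of the word
    lo = x & ((1 << half) - 1)      # lower half of the word
    if hi == 0:
        # Upper half is all zeros: count it fully, then look in the lower half.
        return half + lzc_tree(lo, half)
    # Otherwise the first 1 lies in the upper half.
    return lzc_tree(hi, half)


if __name__ == "__main__":
    print(lzc_tree(0b00010110, 8))
```

For an all-zero input the count equals the word width, which a hardware counter typically signals with a separate all-zero flag.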
