167 MHz radix-8 divide and square root using overlapped radix-2 stages

UltraSPARC's IEEE-754 compliant floating-point divide and square root implementation is presented. Three overlapping stages of SRT radix-2 quotient selection logic enable an effective radix-8 calculation at 167 MHz, while only a single radix-2 quotient selection logic delay is seen in the critical path. Speculative partial remainder and quotient calculation in the main datapath also improves cycle time. The quotient selection logic is slightly modified to prevent the formation of a negative partial remainder for exact results. This saves latency and hardware, as the partial remainder no longer needs to be restored before calculating the sticky bit for rounding.
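The idea of overlapping radix-2 stages can be illustrated with a small behavioral model. The Python sketch below is not the UltraSPARC circuit: it assumes exact rational arithmetic instead of a carry-save partial remainder, uses the textbook radix-2 SRT selection thresholds of ±1/2 on the shifted remainder, and handles division only. It does not model square root, the modified selection for exact results, or rounding. All function names (`select_digit`, `overlapped_radix8_step`, `divide`) are hypothetical. Each "cycle" retires three radix-2 digits; the second and third digits are selected speculatively for every possible earlier digit and then picked once the earlier digit is known, which is roughly how overlapping keeps a single selection delay on the critical path.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
DIGITS = (-1, 0, 1)  # redundant radix-2 SRT digit set


def select_digit(shifted):
    """Textbook radix-2 SRT selection on the shifted remainder 2*w.

    Assumes the divisor d satisfies 1/2 <= d < 1.
    """
    if shifted >= HALF:
        return 1
    if shifted < -HALF:
        return -1
    return 0


def overlapped_radix8_step(w, d):
    """Retire three radix-2 digits in one modeled cycle.

    Digits 2 and 3 are selected speculatively for every possible
    earlier digit; the real digits then act as multiplexer selects.
    """
    # Stage 1: the only selection that depends directly on w.
    q1 = select_digit(2 * w)

    # Stage 2: candidate remainders and digits for each possible q1.
    cand1 = {a: 2 * w - a * d for a in DIGITS}
    spec_q2 = {a: select_digit(2 * cand1[a]) for a in DIGITS}
    q2 = spec_q2[q1]

    # Stage 3: candidates for each possible (q1, q2) pair.
    cand2 = {(a, b): 2 * cand1[a] - b * d for a in DIGITS for b in DIGITS}
    spec_q3 = {k: select_digit(2 * v) for k, v in cand2.items()}
    q3 = spec_q3[(q1, q2)]

    w_next = 2 * cand2[(q1, q2)] - q3 * d
    return (q1, q2, q3), w_next


def divide(x, d, cycles):
    """Compute x / d to 3*cycles quotient bits.

    Assumes 1/2 <= d < 1 and |x| <= d. Returns (quotient, w, digits)
    with x == d * quotient + w / 2**(3 * cycles).
    """
    w, d = Fraction(x), Fraction(d)
    digits = []
    for _ in range(cycles):
        qs, w = overlapped_radix8_step(w, d)
        digits.extend(qs)
    # Convert the signed-digit string to an ordinary value.
    quotient = sum(Fraction(q, 2 ** (j + 1)) for j, q in enumerate(digits))
    return quotient, w, digits
```

For example, `divide(Fraction(3, 8), Fraction(3, 4), 2)` yields a quotient of 1/2 with a zero final remainder. For inexact quotients the final remainder in this sketch can be negative, which in a conventional design forces a correction step before the sticky bit is computed.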
