Analysis of the impact of different methods for division/square root computation in the performance of a superscalar microprocessor

This paper analyzes how different methods for the double-precision computation of division and square root affect the performance of a superscalar processor. The analysis combines the SimpleScalar toolset, estimates of the latency and throughput of the compared methods, and a set of benchmarks with features typical of compute-intensive applications. Simulation results show the importance of an efficient unit for these operations: changes of less than 1% in the density of division and square root operations lead to performance changes of around 20%.
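To see why such a small instruction density can matter, the following minimal sketch uses a first-order stall model. It is not the paper's SimpleScalar methodology. The base CPI, the densities, and the unhidden stall cycles per operation are all hypothetical values chosen for illustration, and the function names are assumptions.

```python
# First-order stall model (an illustrative assumption, not the paper's simulator):
# every division/square-root instruction adds a fixed number of stall cycles
# that the out-of-order core cannot hide.

def cycles_per_instruction(base_cpi, density, stall_cycles):
    """Average CPI when a fraction `density` of instructions stalls for `stall_cycles`."""
    return base_cpi + density * stall_cycles


def slowdown(base_cpi, density, stall_cycles):
    """Ratio of CPI with the long-latency operations to the base CPI."""
    return cycles_per_instruction(base_cpi, density, stall_cycles) / base_cpi


BASE_CPI = 0.5  # hypothetical superscalar core sustaining 2 instructions per cycle

# Sweep hypothetical densities (fraction of all instructions) and stall cycles.
for stall in (10, 20, 30):
    for density in (0.0025, 0.005, 0.01):
        ratio = slowdown(BASE_CPI, density, stall)
        print(f"stall={stall:2d} cycles  density={density:.2%}  slowdown={ratio:.2f}x")
```

Under these assumed values, a density of 0.5% with 20 unhidden stall cycles per operation raises the CPI from 0.5 to 0.6, a 20% increase. The model ignores the overlap of long-latency operations with independent instructions, which a cycle-level simulator captures.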
