Faster floating-point square root for integer processors

This paper presents work in progress on fast and accurate floating-point arithmetic software for ST200-based embedded systems. We show how to use key architectural features to design code that achieves correct rounding to nearest without sacrificing efficiency. This is illustrated with the square root function, whose implementation given here is over 35% faster than the previous best one for such systems.
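
To make the notion of a correctly rounded square root computed with integer instructions only more concrete, here is a minimal sketch in C. It is not the implementation described in this paper and it does not use any ST200-specific feature: it relies on a plain digit-by-digit integer square root, and the names `isqrt64` and `sqrt_rn_sketch` are hypothetical. It assumes a positive, normal binary32 input; zeros, subnormals, negative numbers, infinities and NaNs are deliberately not handled.

```c
#include <stdint.h>
#include <string.h>

/* Floor of the square root of n, by the classical digit-by-digit method. */
static uint64_t isqrt64(uint64_t n)
{
    uint64_t root = 0;
    uint64_t bit = (uint64_t)1 << 62;   /* highest power of 4 in 64 bits */

    while (bit > n)
        bit >>= 2;
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}

/* Square root of a positive normal binary32 number, rounded to nearest,
   using integer operations only (illustrative sketch, see assumptions). */
static float sqrt_rn_sketch(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);               /* reinterpret the encoding */

    int32_t  e = (int32_t)((bits >> 23) & 0xFF) - 127;  /* unbiased exponent */
    uint64_t m = (bits & 0x7FFFFFu) | 0x800000u;        /* 24-bit significand */

    if (e & 1) {            /* make the exponent even: x = (2m) * 2^(e-1) */
        m <<= 1;
        e -= 1;
    }

    /* q = floor(sqrt(m * 2^25)) carries one extra bit below the 24-bit result. */
    uint64_t q = isqrt64(m << 25);

    /* m * 2^25 is even, so its square root is never an odd integer: no exact
       tie can occur and adding one half-unit before truncating rounds to
       nearest. */
    uint32_t r = (uint32_t)((q + 1) >> 1);        /* in [2^23, 2^24) */

    uint32_t out = ((uint32_t)(e / 2 + 127) << 23) | (r & 0x7FFFFFu);
    float y;
    memcpy(&y, &out, sizeof y);
    return y;
}
```

For instance, `sqrt_rn_sketch(4.0f)` returns `2.0f`. The sketch favors clarity over speed: the bit-serial loop in `isqrt64` is exactly the kind of long dependent chain that a fast implementation on a VLIW integer processor would seek to avoid.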
