How to Square Floats Accurately and Efficiently on the ST231 Integer Processor

We consider the problem of computing IEEE floating-point squares by means of integer arithmetic. We show how to exploit the specific properties of squaring to design and implement algorithms that have much lower latency than those for general multiplication, while still guaranteeing correct rounding. Our algorithms are parameterized by the floating-point format, aim to expose high instruction-level parallelism (ILP), and cover all rounding modes. We further show that their C implementation for the binary32 format yields efficient code for targets such as the ST231 VLIW integer processor from STMicroelectronics, with a latency at least 1.75x smaller than that of general multiplication in the same context.
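To make the idea of squaring with integer arithmetic concrete, here is a minimal C sketch for binary32 in round-to-nearest-even mode. It is not the algorithm of the paper and makes no attempt at ILP exposure or ST231-specific tuning; the function names `square_bits_rn` and `square_rn` are hypothetical. It assumes the input is a normal number and that the exact square neither overflows nor underflows, so special values, subnormals, and the other rounding modes are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: square a binary32 value given as its bit pattern,
   using integer operations only, with round-to-nearest-even.
   Assumption: x is normal and x*x stays in the normal range. */
static uint32_t square_bits_rn(uint32_t x)
{
    uint32_t e = (x >> 23) & 0xFFu;              /* biased exponent */
    assert(e != 0 && e != 0xFFu);                /* normal inputs only */

    uint32_t m = (x & 0x7FFFFFu) | 0x800000u;    /* 24-bit significand, in [2^23, 2^24) */
    uint64_t p = (uint64_t)m * m;                /* exact product, in [2^46, 2^48) */

    int32_t er = 2 * (int32_t)e - 127;           /* biased result exponent */
    unsigned shift = 23;
    if (p >> 47) {                               /* significand square in [2, 4) */
        shift = 24;
        er += 1;
    }

    uint32_t q    = (uint32_t)(p >> shift);      /* truncated 24-bit significand */
    uint64_t rem  = p & ((1ULL << shift) - 1);   /* discarded low bits */
    uint64_t half = 1ULL << (shift - 1);
    if (rem > half || (rem == half && (q & 1u)))
        q++;                                     /* round to nearest, ties to even */

    /* q still carries its implicit leading bit, hence (er - 1).
       If rounding carried q up to 2^24, the addition bumps the exponent.
       The sign bit is always 0 because a square is nonnegative. */
    return ((uint32_t)(er - 1) << 23) + q;
}

/* Convenience wrapper operating on float values. */
static float square_rn(float x)
{
    uint32_t bits, out;
    float r;
    memcpy(&bits, &x, sizeof bits);
    out = square_bits_rn(bits);
    memcpy(&r, &out, sizeof r);
    return r;
}
```

Since the sign of the result is known in advance and both operands share one exponent and one significand, the sketch needs no sign logic and only a single exponent computation, which hints at why squaring can be cheaper than general multiplication.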
