A Rigorous Framework for Fully Supporting the IEEE Standard for Floating-Point Arithmetic in High-Level Programming Languages

Processors conforming to the IEEE Standard for Floating-Point Arithmetic (hereafter “the IEEE Standard”) have been commonplace for some years, and several programming languages now appear to support or conform to it. For example, The Java Language Specification by Gosling, Joy, and Steele, which defines the Java language, frequently mentions the IEEE Standard. Indeed, Java, like other languages, supports some features of the IEEE Standard, including a couple of floating-point data formats, and even requires (in section 4.2.4, “Floating-Point Operations,” of that book) that “operators on floating-point numbers behave exactly as specified by IEEE 754.” Arguing that the support current languages offer is not enough, this thesis establishes clear criteria for what it means to fully support the IEEE Standard in a programming language. Each aspect of the IEEE Standard is examined in detail: how various arithmetic engines implement it, how different languages (and their implementations) support it, and what range of options exists for supporting it. Practical recommendations are then offered (particularly, but not exclusively, for Ada and Java), taking into account considerations such as programmer convenience and impact on performance. A detailed model specification following these recommendations is provided for the Ada language. In addition, a variety of issues related to the floating-point aspects of programming languages are discussed, so that the thesis can serve as a more complete guide for language designers. One such issue is floating-point expression evaluation schemes and, more specifically, whether bit-for-bit identical results are actually achievable across a variety of platforms that conform to the IEEE Standard, as the Java language promises.
Closely tied to this issue is that of double rounding, which occurs when a (possibly intermediate) result is rounded more than once before subsequent use or before being delivered to its final destination. This thesis therefore discusses when double rounding makes a difference, how it can be avoided, and what avoiding it costs in performance.
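As a small illustration of double rounding (not taken from the thesis itself), the following Java sketch rounds the exact sum 1 + 2^-24 + 2^-60 to single precision in two ways: once through an intermediate double, and once directly from the exact value. The class name DoubleRoundingDemo and the helpers roundedTwice and roundedOnce are hypothetical names introduced here. The direct rounding is emulated with exact BigDecimal arithmetic and is only valid, by assumption, for sums lying between 1.0f and the next float above it.

```java
import java.math.BigDecimal;

public class DoubleRoundingDemo {
    // Rounds a + b twice: first to double (by the addition itself),
    // then to float (by the narrowing cast).
    static float roundedTwice(double a, double b) {
        double viaDouble = a + b;   // first rounding, to 53 significant bits
        return (float) viaDouble;   // second rounding, to 24 significant bits
    }

    // Rounds the exact value of a + b to float only once.
    // Assumption: the exact sum lies in [1.0f, nextUp(1.0f)], so the result
    // is one of those two neighbours. This is a sketch, not a general routine.
    static float roundedOnce(double a, double b) {
        // new BigDecimal(double) and BigDecimal.add are both exact.
        BigDecimal exact = new BigDecimal(a).add(new BigDecimal(b));
        float lo = 1.0f;
        float hi = Math.nextUp(1.0f);   // 1 + 2^-23
        BigDecimal mid = new BigDecimal(lo).add(new BigDecimal(hi))
                                           .multiply(new BigDecimal("0.5"));
        // Above the midpoint rounds up; a tie goes to lo, whose
        // significand is even (round-to-nearest-even).
        return exact.compareTo(mid) > 0 ? hi : lo;
    }

    public static void main(String[] args) {
        double a = 1.0 + Math.scalb(1.0, -24);  // exactly representable as a double
        double b = Math.scalb(1.0, -60);        // far below half an ulp of a in double

        System.out.println("rounded twice: " + roundedTwice(a, b));
        System.out.println("rounded once:  " + roundedOnce(a, b));
    }
}
```

Here the first rounding discards the 2^-60 term, leaving a value exactly halfway between two adjacent floats, and the second rounding then resolves that tie downward to 1.0f. Rounding the exact sum once gives Math.nextUp(1.0f) instead, because the exact value lies slightly above the midpoint.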
