Paper information - FPU implementations with denormalized numbers

FPU implementations with denormalized numbers

Denormalized numbers are the most difficult type of numbers to implement in floating-point units. They are so complex that certain designs have elected to handle them in software rather than in hardware. Traps to software can result in long execution times, which renders denormalized numbers useless to programmers. This does not have to happen. With a small amount of additional hardware, denormalized numbers and underflows can be handled close to the speed of normalized numbers. This paper summarizes the little-known techniques for handling denormalized numbers. Most of the techniques described here appear only in filed or pending patent applications.

Eric M. Schwarz | Martin S. Schmookler | Son Dao Trong

[1] Charles Roth, et al. A low-power, high-speed implementation of a PowerPC™ microprocessor vector extension, 1999, Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336).

[2] Warren James, et al. 1 GHz HAL SPARC64® Dual Floating Point Unit with RAS features, 2001, Proceedings 15th IEEE Symposium on Computer Arithmetic. ARITH-15 2001.

[3] Peter W. Cook, et al. Second-generation RISC floating point with multiply-add fused, 1990.

[4] Gregory B. Zyner, et al. 167 MHz radix-4 floating point multiplier, 1995, Proceedings of the 12th Symposium on Computer Arithmetic.

[5] Stamatis Vassiliadis, et al. A General Proof for Overlapped Multiple-Bit Scanning Multiplications, 1989, IEEE Trans. Computers.

[6] Eric M. Schwarz, et al. Hardware implementations of denormalized numbers, 2003, Proceedings 2003 16th IEEE Symposium on Computer Arithmetic.

[7] Christopher A. Krygowski, et al. The S/390 G5 floating-point unit, 1999, IBM J. Res. Dev.

[8] Cathy May, et al. The PowerPC Architecture: A Specification for a New Family of RISC Processors, 1994.

[9] Erdem Hokenek, et al. Design of the IBM RISC System/6000 Floating-Point Execution Unit, 1990, IBM J. Res. Dev.

[10] T. Lang, et al. Floating-point fused multiply-add with reduced latency, 2002, Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors.

[11] Eric M. Schwarz, et al. High performance floating-point unit with 116 bit wide divider, 2003, Proceedings 2003 16th IEEE Symposium on Computer Arithmetic.

[12] Peter W. Markstein, et al. IA-64 and elementary functions - speed and precision, 2000.