Design space exploration of an open-source, IP-reusable, scalable floating-point engine for embedded applications

This paper describes an open-source, highly scalable floating-point unit (FPU) for embedded systems. The FPU is fast and efficient because of the high parallelism of its architecture: the functional units inside the datapath operate in parallel and independently of each other. Different versions of the FPU are compared to show how performance scales across them. Logic synthesis results show that the FPU requires 105 Kgates and runs at 400 MHz in a low-power 90 nm standard-cell technology, and requires 20K Logic Elements and runs at 67 MHz on an Altera Stratix FPGA. The proposed FPU is supported by a software tool suite that compiles programs written in C/C++. A set of DSP and 3D graphics algorithms has been benchmarked, showing that with our FPU the number of clock cycles required by each algorithm is one order of magnitude smaller than that required by the corresponding software implementation.
