Arithmetic around the bit heap

A bit heap is a data structure that holds the unevaluated sum of an arbitrary number of bits, each weighted by some power of two. Any multivariate polynomial of binary inputs can be expressed as a bit heap whose bits are simple boolean functions of the input bits. For many large arithmetic designs, viewing them as bit heaps is more relevant than viewing them as a composition of adders and multipliers. It leads to better global optimization at both the algebraic level and the circuit level. However, this notion needs to be supported by tools. This article therefore discusses a generic software framework for the definition, optimization and compression of bit heaps. It specifically targets FPGAs, where complex and application-specific arithmetic circuits must be developed in little time. For this purpose, the textbook notion of a bit array is refined in several ways. Firstly, a bit heap should accept bits arriving at various instants in circuit time, and the bit heap compression process must take this timing into account. Secondly, the DSP blocks of recent FPGAs must be integrated in the bit heap view. Thirdly, the management of signed bit heaps is detailed, and shown to entail no overhead. Finally, a new family of elementary compressors on FPGAs improves upon the state of the art.
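To make the definition concrete, the following Python sketch models a bit heap as a map from weight to a column of bits, builds the heap of an unsigned product, and compresses it with the textbook full adder (3:2 compressor). It is an illustration only, not the article's framework: it ignores bit arrival times, DSP blocks, signed operands and the FPGA-specific compressors, and all names (`to_bits`, `product_bit_heap`, `heap_value`, `compress_once`, `compress`) are hypothetical.

```python
from collections import defaultdict


def to_bits(n, width):
    """Little-endian list of the `width` low-order bits of n."""
    return [(n >> i) & 1 for i in range(width)]


def product_bit_heap(x_bits, y_bits):
    """Bit heap of x*y: column w holds every bit x_i AND y_j with i + j == w."""
    heap = defaultdict(list)
    for i, xi in enumerate(x_bits):
        for j, yj in enumerate(y_bits):
            heap[i + j].append(xi & yj)
    return heap


def heap_value(heap):
    """Evaluate the sum that the heap represents."""
    return sum(bit << w for w, column in heap.items() for bit in column)


def compress_once(heap):
    """One stage of 3:2 compression.

    Each group of three bits in column w is replaced by a sum bit in
    column w and a carry bit in column w + 1; leftover bits pass through.
    """
    out = defaultdict(list)
    for w, column in sorted(heap.items()):
        full = len(column) - len(column) % 3
        for k in range(0, full, 3):
            a, b, c = column[k:k + 3]
            out[w].append(a ^ b ^ c)
            out[w + 1].append((a & b) | (a & c) | (b & c))
        out[w].extend(column[full:])
    return out


def compress(heap):
    """Repeat stages until every column holds at most two bits."""
    while max(len(column) for column in heap.values()) > 2:
        heap = compress_once(heap)
    return heap


if __name__ == "__main__":
    heap = product_bit_heap(to_bits(13, 4), to_bits(11, 4))
    final = compress(heap)
    # The represented value is unchanged by compression.
    print(heap_value(heap), heap_value(final))
```

Once every column holds at most two bits, the remaining two rows can be summed by a single carry-propagate adder. Each stage preserves the value of the heap because a full adder satisfies a + b + c = s + 2·carry.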
