Scalable Hardware-Algorithms for Binary Prefix Sums

The main contribution of this work is to propose a number of broadcast-efficient VLSI architectures for computing the sum and the prefix sums of an $\omega^{\kappa}$-bit binary sequence, $\kappa > 2$, using linear arrays of at most $\omega^2$ shift switches as basic building blocks. An immediate consequence of this feature is that broadcasts in our designs are limited to buses of length at most $\omega^2$, making them eminently practical. Using our design, the sum of an $\omega^{\kappa}$-bit binary sequence can be obtained in the time of $2\kappa - 2$ broadcasts, using $2\omega^{\kappa-2} + O(\omega^{\kappa-3})$ blocks, while the corresponding prefix sums can be computed in $3\kappa - 4$ broadcasts using $(\kappa + 2)\omega^{\kappa-2} + O(\kappa\omega^{\kappa-3})$ blocks.
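To make the quantities above concrete, the following is a minimal software sketch, not the hardware design itself. It shows what the prefix sums of a binary sequence are, how they can be assembled from fixed-length blocks (here of length $\omega^2$, mirroring the bus-length bound), and how the quoted broadcast and block counts evaluate for small parameters. All function names are hypothetical, and the block-count helper keeps only the leading terms, dropping the lower-order $O(\cdot)$ terms whose constants the abstract does not give.

```python
from itertools import accumulate


def binary_prefix_sums(bits):
    """Reference definition: out[i] = bits[0] + ... + bits[i]."""
    return list(accumulate(bits))


def blocked_prefix_sums(bits, block_len):
    """Prefix sums computed one block at a time.

    Each block is scanned locally, then offset by the running total of
    all earlier blocks. This is a sequential illustration only; it does
    not model shift switches or bus broadcasts.
    """
    out = []
    carry = 0  # sum of all bits in the blocks already processed
    for start in range(0, len(bits), block_len):
        local = list(accumulate(bits[start:start + block_len]))
        out.extend(carry + s for s in local)
        if local:
            carry += local[-1]
    return out


def broadcast_counts(kappa):
    """Broadcast counts as quoted in the abstract."""
    return {"sum": 2 * kappa - 2, "prefix_sums": 3 * kappa - 4}


def leading_block_counts(omega, kappa):
    """Leading terms of the block counts (assumption: O() terms dropped)."""
    base = omega ** (kappa - 2)
    return {"sum": 2 * base, "prefix_sums": (kappa + 2) * base}


if __name__ == "__main__":
    omega, kappa = 2, 3                      # input length omega**kappa = 8
    bits = [1, 0, 1, 1, 0, 1, 1, 0]
    print(blocked_prefix_sums(bits, omega ** 2))
    print(broadcast_counts(kappa))
    print(leading_block_counts(omega, kappa))
```

With $\omega = 2$ and $\kappa = 3$ the input has 8 bits and is split into two blocks of length 4. The blocked result matches the reference definition, and the formulas give 4 broadcasts for the sum and 5 for the prefix sums.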
