Optimizing systems for effective block-processing: the k-delay problem

Block-processing is a powerful and popular technique for increasing computation speed by processing several samples of data simultaneously. Its effectiveness is often reduced, however, by suboptimal placement of delays in the dataflow graph of a computation. In this paper we investigate an application of the retiming transformation for improving the effectiveness of block-processing in computation structures. Specifically, we consider the k-delay problem: given a computation and an integer k, retime the computation so that the result can process k data samples simultaneously in a fully regular manner. Our main contribution is an O(V^3 E + V^4 log V)-time algorithm for the k-delay problem, where V is the number of computation blocks and E is the number of interconnections in the computation.
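To make the retiming transformation concrete, the following is a minimal sketch, not the paper's algorithm. It uses the standard retiming rule, under which an edge u -> v carrying w delays carries w + r(v) - r(u) delays after retiming by r. The abstract does not state the exact k-delay condition, so the check `satisfies_k_delay` assumes one plausible reading: every edge carries either zero delays or at least k delays. The function names, the edge-list representation, and the three-block example graph are all hypothetical.

```python
from typing import Dict, List, Tuple

# An edge is (source block, destination block, number of delays on the edge).
Edge = Tuple[str, str, int]


def retime(edges: List[Edge], r: Dict[str, int]) -> List[Edge]:
    """Apply a retiming r using the standard rule w_r(u->v) = w + r(v) - r(u).

    Blocks missing from r are treated as having retiming value 0.
    Raises ValueError if any edge would end up with a negative delay count.
    """
    result = []
    for u, v, w in edges:
        new_w = w + r.get(v, 0) - r.get(u, 0)
        if new_w < 0:
            raise ValueError(f"illegal retiming: edge {u}->{v} gets {new_w} delays")
        result.append((u, v, new_w))
    return result


def satisfies_k_delay(edges: List[Edge], k: int) -> bool:
    """ASSUMPTION (not taken from the abstract): a graph is acceptable for
    block size k when every edge has either 0 delays or at least k delays."""
    return all(w == 0 or w >= k for _, _, w in edges)


# Hypothetical three-block loop with two delays spread over two edges.
edges = [("a", "b", 1), ("b", "c", 1), ("c", "a", 0)]

# Moving one delay across block b gathers both delays on edge b->c.
retimed = retime(edges, {"b": -1})

print(satisfies_k_delay(edges, 2))    # original placement
print(satisfies_k_delay(retimed, 2))  # placement after retiming
print(retimed)
```

Retiming never changes the total number of delays around a cycle, so in this example the loop still carries two delays after the transformation; only their placement changes. Finding a suitable r for an arbitrary graph is the hard part that the paper's algorithm addresses, and this sketch does not attempt it.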
