Maximally fast and arbitrarily fast implementation of linear computations

Linear systems are the most frequently used type of system in many engineering and scientific areas. By establishing a relationship between the basic properties of linear computations and several optimizing transformations, it is possible to optimally speed up linear computations with respect to those transformations while keeping the latency fixed. Furthermore, arbitrarily fast, asymptotically optimal implementations can be obtained by adding retiming and loop unrolling to the transformation set and trading latency for throughput. The proposed techniques have yielded results superior to the best previously published results on all benchmark examples. Finally, the presented approach is also applicable to general (non-linear) computations.

1.0 Motivation and Prior Art

The major goal of this paper is twofold. The first aim is to demonstrate how, for the large class of linear computations, the maximally fast implementation with respect to five important and powerful transformations (associativity, distributivity, commutativity, common subexpression and constant propagation) can be efficiently derived. The second is to show how an arbitrarily fast, asymptotically optimal (with respect to the hardware cost) implementation of a general linear computation can be produced by combining those five transformations with retiming and loop unfolding.

Transformations alter the organization of a computation in such a way that the user-specified input/output relationship is maintained. They are often used as an effective approach for improving the implementation of computations. Their use in compilers (1,5), theoretical computer science (2) and high-level synthesis (3,7,9,11) is surveyed in (10).
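As a small illustration of a transformation that reorganizes a computation while maintaining its input/output relationship, the sketch below unfolds a first-order linear recurrence by a factor of two. It is not the algorithm of this paper. The recurrence y[n] = a*y[n-1] + x[n], the function names `direct` and `unfolded_twice`, and the sample data are all assumptions chosen for illustration, and integer values are used so that the two forms can be compared exactly.

```python
def direct(a, xs, y0=0):
    """Reference form: y[n] = a*y[n-1] + x[n], one sample per iteration."""
    ys, y = [], y0
    for x in xs:
        y = a * y + x
        ys.append(y)
    return ys


def unfolded_twice(a, xs, y0=0):
    """Same recurrence, two samples per iteration (len(xs) assumed even)."""
    a2 = a * a  # constant propagation: a*a is computed once, outside the loop
    ys, y = [], y0
    for i in range(0, len(xs), 2):
        x0, x1 = xs[i], xs[i + 1]
        y_first = a * y + x0
        # distributivity: a*(a*y + x0) + x1 == a2*y + (a*x0 + x1);
        # the term (a*x0 + x1) does not depend on the state y
        y_second = a2 * y + (a * x0 + x1)
        ys += [y_first, y_second]
        y = y_second
    return ys


samples = [1, -2, 3, 5, 0, 4]  # hypothetical input sequence
assert direct(3, samples) == unfolded_twice(3, samples)
```

In the unfolded form the state `y` advances by two samples through one multiplication and one addition, whereas the direct form needs two multiplications and two additions in sequence to cover the same two samples. The cost is that two input samples must be available before an iteration starts, which is a simple instance of trading latency for throughput.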

[1] Alfred V. Aho, et al. Principles of Compiler Design, 1977.

[2] M. Potkonjak, et al. Maximally fast and arbitrarily fast implementation of linear computations (circuit layout CAD), 1992, 1992 IEEE/ACM International Conference on Computer-Aided Design.

[3] Robert A. Walker, et al. A Survey of high-level synthesis systems, 1991.

[4] Donald A. Lobo, et al. Redundant operator creation: a scheduling optimization technique, 1991, 28th ACM/IEEE Design Automation Conference.

[5] Jeffrey D. Ullman. Computational Aspects of VLSI, 1984.

[6] Alfred V. Aho, et al. Principles of Compiler Design (Addison-Wesley series in computer science and information processing), 1977.

[7] Charles N. Fischer, et al. Crafting a Compiler, 1988.

[8] Miodrag Potkonjak, et al. Optimizing resource utilization using transformations, 1991, 1991 IEEE International Conference on Computer-Aided Design Digest of Technical Papers.

[9] Howard Trickey, et al. Flamel: A High-Level Hardware Compiler, 1987, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[10] Allan Borodin, et al. The computational complexity of algebraic and numeric problems, 1975, Elsevier computer science library.