Optimum and heuristic transformation techniques for simultaneous optimization of latency and throughput

Although throughput alone can be arbitrarily improved for several classes of systems using previously published techniques, none of those approaches is effective when latency constraints, which are increasingly important in embedded DSP systems, are considered. After formally establishing the relationship between latency and throughput in general computation, we explore the effect of pipelining on latency, and establish necessary and sufficient conditions under which pipelining does not alter latency. Many systems are either linear or have subsystems that are linear. For such cases we use a state-space based approach that treats various transformations in an integrated fashion and answers analytically whether it is possible to simultaneously meet any given combination of constraints on latency and throughput. The analytic approach is constructive in nature, and produces a complete implementation when the feasibility conditions are fulfilled. We also present a suboptimal but hardware-efficient heuristic approach for the special case of initially relaxed single-input single-output linear time-invariant computations. A novel software platform consisting of a high-level synthesis system coupled to a symbolic algebra system was used to implement the proposed algorithm transformations. Instead of being used to improve throughput and latency, our transformations can also be used to increase implementation efficiency while achieving the same latency and throughput as the original design.
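
The abstract gives no equations, so the following is only a minimal sketch of the kind of state-space manipulation it refers to, not the paper's algorithm. It assumes the standard single-input state update x[n+1] = A x[n] + B u[n] and shows k-fold unfolding, where the state is advanced k samples at a time as x[n+k] = A^k x[n] + A^(k-1) B u[n] + ... + B u[n+k-1]. The function names (`step`, `unfold`, `block_step`) and the example matrices are hypothetical, chosen for illustration.

```python
from fractions import Fraction

def mat_mul(X, Y):
    # Plain matrix product on lists of lists.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def step(A, B, x, u):
    # One sample of the original recursion: x[n+1] = A x[n] + B u[n].
    return mat_add(mat_mul(A, x), [[row[0] * u] for row in B])

def unfold(A, B, k):
    # Precompute A^k and the input weights [A^(k-1) B, ..., A B, B].
    n = len(A)
    Ak = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    cols = []
    for _ in range(k):
        cols.insert(0, mat_mul(Ak, B))  # prepend A^i B for i = 0, 1, ...
        Ak = mat_mul(A, Ak)
    return Ak, cols

def block_step(Ak, cols, x, us):
    # One iteration of the unfolded recursion, consuming k inputs at once.
    out = mat_mul(Ak, x)
    for c, u in zip(cols, us):
        out = mat_add(out, [[row[0] * u] for row in c])
    return out

# Illustrative 2-state system (values are arbitrary assumptions).
A = [[Fraction(1, 2), Fraction(1, 4)], [Fraction(0), Fraction(1, 3)]]
B = [[Fraction(1)], [Fraction(1)]]
x0 = [[Fraction(0)], [Fraction(0)]]
inputs = [1, 2, 3, 4]

# Reference: four sequential steps.
x_seq = x0
for u in inputs:
    x_seq = step(A, B, x_seq, u)

# Unfolded by k = 2: two block steps cover the same four samples.
A2, cols2 = unfold(A, B, 2)
x_blk = x0
for i in range(0, len(inputs), 2):
    x_blk = block_step(A2, cols2, x_blk, inputs[i:i + 2])

print(x_seq == x_blk)
```

In the unfolded form the state feedback is traversed once per k samples, and the input terms involve no feedback, which is what makes such forms attractive for throughput. How the extra input terms are scheduled determines the effect on latency, which this sketch does not model.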
