Behavioral level guidance using property-based design characterization

The growing importance of optimization, short time-to-market windows, and exponentially growing design complexity are just a few of the factors shaping the state-of-the-art synthesis process. Optimization at the early stages of design is especially crucial: at the system and behavioral levels, order-of-magnitude improvements can be attained in key design metrics such as throughput, power, and area. Attaining them, however, requires strategic and coordinated application of the design techniques best suited to the target design. The number of options currently available is overwhelming, and as a result design exploration is often conducted in a qualitative, ad hoc manner. To address these challenges, this thesis introduces a new design methodology that guides the exploration process toward effective sequences of design optimizations quickly. The building blocks of the methodology are quantitative design characterization and a library of characterized optimization techniques. Design characterization uses a set of techniques to automatically extract the "essence" of a design description. The library of characterized optimization techniques encapsulates knowledge about the effectiveness, scope, and interdependencies of various optimizations. Together, these two building blocks enable analysis of optimization alternatives, and they have been encapsulated in an interactive guidance environment. The guidance environment suggests and ranks potential optimizations in terms of both immediate and longer-term impact. It also provides evaluations of the design and of the likely effect of each optimization on performance. Using the provided guidance, designers can make better-informed decisions and explore the design space more effectively, resulting in shorter design time and more highly optimized designs.
A core contribution of this thesis is the design characterization. The essence of the design is captured using property metrics that are shown to be related to the quality of algorithm-architecture mappings. The following properties and their quantifications are presented: size, topology, timing, concurrency, uniformity, locality, and regularity. Beyond serving as a key component of the guidance methodology, property metrics are shown to be effective in algorithm selection, performance estimation, and architectural synthesis.
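
To make the idea of a property metric concrete, the following is a minimal sketch, not the thesis's actual quantifications, which the abstract does not state. It assumes the design is given as an acyclic dataflow graph with a delay per operation, and uses common textbook stand-ins for three of the listed properties: size as the operation count, timing as the critical path length, and concurrency as total work divided by critical path. The function name `characterize`, the input format, and the example graph are all hypothetical.

```python
from collections import defaultdict


def characterize(edges, delay):
    """Compute illustrative property metrics for an acyclic dataflow graph.

    edges: iterable of (producer, consumer) pairs.
    delay: dict mapping every operation name to its execution delay.
    The metric definitions are assumptions made for illustration only.
    """
    nodes = set(delay)
    successors = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Traverse in topological order, tracking the earliest start of each node.
    start = {n: 0 for n in nodes}
    finish = {}
    ready = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        finish[u] = start[u] + delay[u]
        for v in successors[u]:
            start[v] = max(start[v], finish[u])
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle")

    critical_path = max(finish.values(), default=0)
    total_work = sum(delay.values())
    concurrency = total_work / critical_path if critical_path else 0.0
    return {
        "size": len(nodes),
        "critical_path": critical_path,
        "concurrency": concurrency,
    }


# Hypothetical example: three multiplications feeding a chain of two additions.
example_delay = {"m1": 2, "m2": 2, "m3": 2, "a1": 1, "a2": 1}
example_edges = [("m1", "a1"), ("m2", "a1"), ("a1", "a2"), ("m3", "a2")]
metrics = characterize(example_edges, example_delay)
print(metrics)
```

In a guidance setting of the kind described above, metrics like these could be recomputed after each candidate optimization to compare alternatives; how the thesis actually ranks optimizations is not specified in this abstract.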
