Automatic performance model synthesis from hardware verification models

Performance models are typically written by hand for each new model or assembled piecemeal from the simulation code of an earlier model. In either case, many man-months of work may be required to write the new model and validate its design details against a prior or current design. Yet most of the information about the performance of the design already exists in the design structure of the old hardware model, the new one, or both. To harvest this information and eliminate the substantial duplicate coding and validation effort, we propose that a performance model be automatically synthesized from a prior or current hardware design using a bottom-up, design-oriented approach. We demarcate the performance-critical boundaries of the design and perform backward-trace cone analysis from those boundaries to identify the logic to include in the performance model. We then abstract specific components to accommodate design changes and expend modeling effort only on the few functions relevant to a particular design study. Engineering effort then becomes focused on workload selection and quality, defining and projecting new designs, and assessing design tradeoffs and sensitivities, which is the small set of tasks with the highest potential to improve design performance. We present a case study showing that even the simplest proposed transformations on a high-performance IBM L2 cache design yield a simulation speedup of 3.9x, with evidence that an order-of-magnitude speedup can be obtained using a few additional modeling abstractions.
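As an illustration of the backward-trace cone analysis mentioned above, the following is a minimal sketch, not the authors' tool. It assumes the hardware model has already been flattened into a fan-in map from each signal to the signals that drive it. The signal names (`l2_hit`, `tag_match`, `scan_out`, and so on) and the function `backward_cone` are hypothetical and chosen only for the example.

```python
from collections import deque


def backward_cone(fanin, boundary_signals):
    """Return every signal that can influence any boundary signal.

    fanin: dict mapping a signal name to the list of signals driving it
           (an assumed, simplified netlist representation).
    boundary_signals: signals marked as performance-critical boundaries.
    """
    cone = set()
    worklist = deque(boundary_signals)
    while worklist:
        sig = worklist.popleft()
        if sig in cone:
            continue  # already traced; also guards against feedback loops
        cone.add(sig)
        # Walk backward through the drivers of this signal.
        worklist.extend(fanin.get(sig, ()))
    return cone


# Hypothetical toy netlist: only the logic feeding "l2_hit" matters for timing.
fanin = {
    "l2_hit": ["tag_match", "valid"],
    "tag_match": ["req_addr", "tag_array_out"],
    "valid": ["dir_state"],
    "parity_err": ["data_array_out"],  # not performance-critical here
    "scan_out": ["scan_in"],           # test logic, outside the cone
}

cone = backward_cone(fanin, ["l2_hit"])
excluded = set(fanin) - cone
```

In this sketch, everything outside `cone` (here the parity and scan logic) is a candidate for removal from the synthesized performance model, which is one plausible source of the simulation speedup the abstract describes.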
