Fast Design Space Exploration Using Local Regression Modeling With Application to ASIPs

Configuring an application-specific instruction-set processor (ASIP) through an exhaustive search of the design space is computationally prohibitive. We therefore propose a novel algorithm that models the design space using local regression. With only a small subset of the design space sampled, our model uses statistical inference to estimate all remaining points. This technique enables existing design space exploration approaches to make longer strides toward the optimal point while evaluating fewer points in the design space. We tested our approach on two important aspects of processor architecture. First, we optimized the pattern history table (PHT) of a GSelect branch predictor to minimize the total energy of an embedded processor. Our approach found the optimal configuration for the majority of benchmarks tested. Configuring the PHT size with our approach reduced total processor energy by 17.2% on average, close to the 17.6% achievable with optimal configurations. We then extended our approach to a multidimensional cache tuning problem in which we configured a two-level cache hierarchy with 19 278 possible configurations. In this case, only 1% of the design space was simulated, yielding a 100-fold speedup. In doing so, we identified near-optimal configurations for most benchmarks and reduced the overall energy of the processor by 13.9% on average, and by as much as 53% for one benchmark.
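To make the idea of estimating unsampled design points concrete, the following is a minimal one-dimensional sketch of locally weighted linear regression in the general style of LOESS. It is not the authors' algorithm, which the abstract does not specify. The function names, the tricube kernel, the neighbourhood size `k`, and the bandwidth rule are all assumptions chosen for illustration.

```python
def tricube(u):
    """Tricube kernel: weight falls smoothly to zero at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_estimate(x0, xs, ys, k=4):
    """Estimate y at x0 with a weighted line fit over the k nearest samples.

    xs, ys: sampled design points and their measured cost (e.g. energy).
    k: hypothetical neighbourhood size, not a value taken from the paper.
    """
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    # Bandwidth sits just beyond the farthest neighbour so its weight stays > 0.
    h = max(abs(x - x0) for x, _ in nearest) * 1.001 + 1e-12
    w = [tricube((x - x0) / h) for x, _ in nearest]
    sw = sum(w)
    # Weighted means, then weighted least-squares slope around those means.
    mx = sum(wi * x for wi, (x, _) in zip(w, nearest)) / sw
    my = sum(wi * y for wi, (_, y) in zip(w, nearest)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, nearest))
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, nearest))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


def predict_best(candidates, sampled, k=4):
    """Estimate the cost of every candidate and return the predicted minimum.

    candidates: all design points (e.g. log2 of the PHT size).
    sampled: dict mapping the few simulated points to their measured cost.
    Returns (best_candidate, estimates) where estimates maps point -> cost.
    """
    xs = list(sampled.keys())
    ys = list(sampled.values())
    estimates = {c: local_linear_estimate(c, xs, ys, k) for c in candidates}
    best = min(estimates, key=estimates.get)
    return best, estimates
```

In use, one would simulate a handful of configurations, pass them as `sampled`, and then simulate only the predicted best point and perhaps its neighbours to confirm the estimate. A multidimensional problem such as the cache hierarchy would need a distance over several parameters and a multivariate fit, which this sketch does not attempt.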
