Comprehensive multivariate extrapolation modeling of multiprocessor cache miss rates

Cache miss rates are an important subset of system model inputs. Cache miss rate models are used for broad design space exploration in which many cache configurations cannot be simulated directly because of limitations of trace collection setups or available resources. Simulating large caches is often impractical, and large processor counts, with their potentially high degree of cache sharing, frequently cannot be reproduced on small existing systems. In this article, we present an approach to building multivariate regression models for predicting cache miss rates beyond the range of collectible data. The extrapolation model attempts to accurately estimate the high-level trend of the existing data, which can then be extended in a natural way. We extend previous work in two respects: the approach applies to multiple miss rate components, and it models a wide range of cache parameters, including size, line size, associativity, and sharing. Stability of extrapolation is a crucial requirement, and the proposed model is shown to be stable to small data perturbations that may be introduced during data collection. We show the effectiveness of the technique by applying it to two commercial workloads. The wide design space contains configurations that are much larger than those for which miss rate data were available. The fitted values match the simulation data very well. The resulting curves show that a miss rate model is useful not only for estimating the performance of specific configurations but also for providing insight into miss rate trends.
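The idea of fitting a trend on small configurations and extending it to larger ones can be illustrated with a minimal sketch. It is not the model of the article: it swaps in a plain log-linear least-squares fit, and the data, the coefficients in `true_coef`, and the helper names (`design_matrix`, `fit_log_linear`, `predict`) are all hypothetical and chosen only for illustration. The sketch fits the log miss rate against log2 of cache size, line size, associativity, and sharing degree, extrapolates to a configuration well outside the fitted range, and then refits on slightly perturbed data to see how much the extrapolated value moves.

```python
import numpy as np

def design_matrix(size_kb, line_b, assoc, sharers):
    """Intercept plus log2 of each cache parameter (an assumed feature set)."""
    return np.column_stack([
        np.ones_like(size_kb, dtype=float),
        np.log2(size_kb),
        np.log2(line_b),
        np.log2(assoc),
        np.log2(sharers),
    ])

def fit_log_linear(X, miss_rate):
    """Ordinary least squares on log(miss rate)."""
    coef, *_ = np.linalg.lstsq(X, np.log(miss_rate), rcond=None)
    return coef

def predict(coef, X):
    """Map the fitted log-scale trend back to a miss rate."""
    return np.exp(X @ coef)

# Synthetic "collectible" data on a small grid (hypothetical values).
sizes = np.array([64, 128, 256, 512, 1024], dtype=float)   # KB
lines = np.array([32, 64, 128], dtype=float)               # bytes
assocs = np.array([1, 2, 4], dtype=float)
sharers = np.array([1, 2, 4], dtype=float)
grid = np.array(np.meshgrid(sizes, lines, assocs, sharers, indexing="ij"))
grid = grid.reshape(4, -1)                                 # 4 parameters x 135 configs

X = design_matrix(*grid)
true_coef = np.array([-1.0, -0.5, -0.3, -0.1, 0.2])        # made-up trend
miss = np.exp(X @ true_coef)                               # noise-free miss rates

coef = fit_log_linear(X, miss)

# Extrapolate to a 16 MB cache shared by 16 processors, outside the grid.
X_new = design_matrix(np.array([16384.0]), np.array([64.0]),
                      np.array([4.0]), np.array([16.0]))
pred = predict(coef, X_new)[0]

# Stability check: perturb the data by about 1% and refit.
rng = np.random.default_rng(0)
noisy = miss * np.exp(rng.normal(0.0, 0.01, size=miss.shape))
pred_noisy = predict(fit_log_linear(X, noisy), X_new)[0]
rel_change = abs(pred_noisy - pred) / pred

print(f"extrapolated miss rate: {pred:.3e}")
print(f"relative change under perturbation: {rel_change:.3%}")
```

A purely parametric fit like this one is stable mainly because it is so rigid; a more flexible model would need explicit constraints (for example, monotonicity in cache size) to keep its extrapolations from reacting strongly to small perturbations.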
