Adaptive multivariate regression for advanced memory system evaluation: application and experience

Abstract: Recent advances in latency-hiding techniques have made the performance evaluation of memory hierarchies more difficult. Applications compiled for a particular architecture may be executed on vastly different memory hierarchy implementations. Performance analysis techniques are therefore needed that explain how applications interact with a given memory hierarchy. In this paper, we present a statistical approach to the performance analysis of advanced memory hierarchy implementations. The method combines existing statistical analysis techniques with scalability analysis. The result is a novel step-wise approach to understanding the hierarchical memory performance of scientific applications. We apply the method to several scientific applications of interest to the Accelerated Strategic Computing Initiative (ASCI) on two SGI machines, the PowerChallenge and the Origin 2000. Results indicate that some codes are statistically identical in memory performance, while others vary greatly. Furthermore, some codes do not take advantage of the memory system enhancements found in the Origin 2000.
