PROFET: Modeling System Performance and Energy Without Simulating the CPU

Application performance on novel memory systems is typically estimated using a hardware simulator. Simulation is, however, time-consuming, which limits the number of design options that can be explored in a practical amount of time. Also, although memory simulators are typically well validated, current CPU simulators have various shortcomings, such as simplified out-of-order execution, an obsolete data prefetcher, and a lack of virtual-to-physical memory translation, all of which can cause large discrepancies between simulated and actual memory system behavior.
