How to Evaluate Various Commonly Used Program Classification Methods
