Auto-tuning multi-programmed workload on the SCC

The need for power-aware computing has become increasingly apparent. Common power-aware platforms place the burden of optimizing energy consumption on the programmer. In many cases this is a complex task that demands more of the programmer's time than is acceptable. Auto-tuning for power-aware computing has therefore been proposed to relieve the programmer of this task. Previous research has focused on automatically tuning individual applications; little work has addressed tuning multiple programs across an entire platform. The Single-Chip Cloud Computer (SCC) is an experimental many-core processor created by Intel Labs. In this paper, we present a method that extends auto-tuning to the multi-programmed workload running across the entire many-core SCC platform. Using an algorithm based on Differential Evolution, we reduced the energy-delay product of the workload by 58.5%.
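
To make the optimization idea concrete, the following is a minimal sketch of classic Differential Evolution (the DE/rand/1/bin scheme) minimizing an energy-delay product (EDP), that is, energy multiplied by execution time. It is not the tuner described in this paper. The cost function `toy_edp`, its workload sizes, the static-power constant, the normalized frequency bounds, and every parameter name are hypothetical stand-ins chosen for illustration; a real tuner would measure energy and delay on the hardware instead of evaluating a formula.

```python
import random

def toy_edp(freqs, work=(1.0, 2.0, 4.0), static_power=0.5):
    """Hypothetical EDP model: one program per frequency domain.

    Assumes dynamic power grows as f**3, so a program with `w` units
    of work spends w * f**2 dynamic energy and w / f time.
    """
    delay = max(w / f for w, f in zip(work, freqs))  # slowest program
    dynamic_energy = sum(w * f * f for w, f in zip(work, freqs))
    static_energy = static_power * len(freqs) * delay
    return (dynamic_energy + static_energy) * delay

def differential_evolution(cost, bounds, initial=None, pop_size=20,
                           f_weight=0.8, cr=0.9, generations=100, seed=0):
    """DE/rand/1/bin with greedy selection; returns (best_vector, best_cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    if initial is not None:
        pop[0] = list(initial)  # seed the search with a known configuration
    scores = [cost(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the allowed range
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_score = cost(trial)
            if trial_score <= scores[i]:  # keep the trial only if no worse
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

# Normalized frequencies for three hypothetical frequency domains.
bounds = [(0.1, 1.0)] * 3
baseline = [1.0, 1.0, 1.0]  # everything at maximum frequency
best, score = differential_evolution(toy_edp, bounds, initial=baseline)
print("baseline EDP:", toy_edp(baseline))
print("tuned frequencies:", best, "EDP:", score)
```

Seeding the population with the all-maximum baseline means the greedy selection step can never return a configuration worse than that baseline under the model. In this toy model the program with the most work sets the delay, so the search tends to slow down the lighter programs, which is the kind of cross-program trade-off a multi-programmed tuner is meant to exploit.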
