The Yin and Yang of power and performance for asymmetric hardware and managed software

On the hardware side, asymmetric multicore processors (AMPs) present software with both the challenge and the opportunity of optimizing in two dimensions: performance and power. AMPs combine general-purpose big (fast, high-power) cores and small (slow, low-power) cores to meet power constraints. Realizing their energy-efficiency opportunity requires workloads with differentiated performance and power characteristics.
