Chameleon: operating system support for dynamic processors

The rise of multicore processors has shifted performance efforts towards parallel programs. However, single-threaded code, whether from legacy programs or from ones that are difficult to parallelize, remains important. Proposed asymmetric multicore processors statically dedicate hardware to improve sequential performance, but at the cost of reduced parallel performance. Several proposed mechanisms instead provide the best of both worlds by combining multiple cores into a single, more powerful processor for sequential code. For example, Core Fusion merges multiple cores to pool caches and functional units, and Intel's Turbo Boost raises the clock speed of a core if the other cores on a chip are powered down. These reconfiguration mechanisms have two important properties. First, the set of available cores and their capabilities can vary over short time scales. Current operating systems are not designed for rapidly changing hardware: the existing hotplug mechanisms for reconfiguring processors require global operations and take hundreds of milliseconds to complete. Second, configurations may be mutually exclusive: power used to speed up one core cannot be used to speed up another. Current schedulers cannot manage this requirement. We present Chameleon, an extension to Linux that supports dynamic processors, which can reconfigure their cores at runtime. Chameleon provides processor proxies to enable rapid reconfiguration, execution objects to abstract the processing capabilities of physical CPUs, and a cluster scheduler to balance the needs of sequential and parallel programs. In experiments that emulate a dynamic processor, we find that Chameleon can reconfigure processors 100,000 times faster than Linux and allows applications full access to hardware capabilities: sequential code runs at full speed on a powerful execution context, while parallel code runs on as many cores as possible.
