Bias scheduling in heterogeneous multi-core architectures

Heterogeneous architectures that integrate a mix of big and small cores are very attractive because they can achieve high single-threaded performance while enabling high-performance thread-level parallelism at lower energy cost. Despite their benefits, they pose significant challenges to operating system software, and thread scheduling is one of the most critical. In this paper we propose bias scheduling for heterogeneous systems whose cores differ in microarchitecture and performance. We identify key metrics that characterize an application's bias, namely the core type that best suits its resource needs. By dynamically monitoring application bias, the operating system is able to match threads to the core type that can maximize system throughput. Bias scheduling takes advantage of this by influencing the existing scheduler to select the core type that best suits the application when performing load balancing operations. Bias scheduling can be implemented on top of most existing schedulers, since its impact is limited to changes in the load balancing code. In particular, we implemented it over the Linux scheduler on a real system that models microarchitectural differences accurately and found that it can improve system performance significantly, in proportion to the diversity of application bias present in the workload. Unlike previous work, bias scheduling does not require sampling of CPI on all core types or offline profiling. We also expose the limits of dynamic voltage/frequency scaling as an evaluation vehicle for heterogeneous systems.
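
To make the idea concrete, the following is a minimal sketch of how a load balancer might prefer threads whose bias matches the destination core type. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the abstract does not name the metrics, so the single `stall_fraction` metric, the two thresholds, and the names `Thread`, `classify`, and `pick_thread_to_migrate` are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Bias(Enum):
    BIG = "big"
    SMALL = "small"
    NEUTRAL = "neutral"


@dataclass
class Thread:
    name: str
    # ASSUMPTION: one hypothetical per-thread metric, the fraction of cycles
    # spent stalled. The abstract does not specify the actual metrics.
    stall_fraction: float


def classify(thread: Thread,
             small_threshold: float = 0.6,
             big_threshold: float = 0.3) -> Bias:
    """Map the monitored metric to a bias. Thresholds are illustrative only."""
    if thread.stall_fraction >= small_threshold:
        # Mostly stalled: assumed to gain little from a big core.
        return Bias.SMALL
    if thread.stall_fraction <= big_threshold:
        # Rarely stalled: assumed to benefit from a big core.
        return Bias.BIG
    return Bias.NEUTRAL


def pick_thread_to_migrate(run_queue: List[Thread],
                           target_core_type: Bias) -> Optional[Thread]:
    """Choose the thread a load-balancing step should move to a core of
    target_core_type (Bias.BIG or Bias.SMALL).

    Preference order: matching bias, then neutral, then anything else.
    Only the choice of thread changes; whether to balance at all is left
    to the underlying scheduler.
    """
    if not run_queue:
        return None

    def rank(thread: Thread) -> int:
        bias = classify(thread)
        if bias is target_core_type:
            return 0
        if bias is Bias.NEUTRAL:
            return 1
        return 2

    # min() returns the first thread with the lowest rank, so queue order
    # breaks ties.
    return min(run_queue, key=rank)


if __name__ == "__main__":
    queue = [Thread("a", 0.8), Thread("b", 0.1), Thread("c", 0.45)]
    print(pick_thread_to_migrate(queue, Bias.BIG).name)
    print(pick_thread_to_migrate(queue, Bias.SMALL).name)
```

The sketch touches only the selection step inside load balancing, which mirrors the claim that bias scheduling can sit on top of an existing scheduler. A real implementation would read its metrics from hardware performance counters rather than from a stored field.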
