Corey: An Operating System for Many Cores

Multiprocessor application performance can be limited by the operating system when the application invokes it frequently and its services rely on data structures that multiple cores share and modify. If the application does not need that sharing, the operating system becomes an unnecessary bottleneck. This paper argues that applications should control sharing: the kernel should arrange each data structure so that only a single processor needs to update it, unless the application directs otherwise. Guided by this principle, the paper proposes three operating system abstractions (address ranges, kernel cores, and shares) that let applications control inter-core sharing and exploit the likely abundance of cores by dedicating cores to specific operating system functions. Microbenchmarks on the Corey prototype operating system, which embodies the new abstractions, show how control over sharing can improve performance. Application benchmarks using MapReduce and a Web server show that the improvements can be significant for overall performance: MapReduce on Corey runs 25% faster than on Linux when using 16 cores. Hardware event counters confirm that these improvements come from avoiding operations that are expensive on multicore machines.
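
The principle that a data structure stays private to one core unless the application asks for sharing can be illustrated with a toy model. The sketch below is not Corey's interface: the class `CoreLocalTable`, its fields, and the `lock_acquisitions` counter are hypothetical names chosen for illustration, and the counter merely stands in for the cross-core synchronization a real kernel would incur.

```python
from dataclasses import dataclass, field


@dataclass
class CoreLocalTable:
    """Toy lookup table: private to one core unless the application shares it."""
    owner: int                      # hypothetical id of the core that created the table
    shared: bool = False            # sharing is opt-in, decided by the application
    entries: dict = field(default_factory=dict)
    lock_acquisitions: int = 0      # stand-in for cross-core synchronization cost

    def insert(self, core: int, key: str, value: object) -> None:
        # A private table rejects updates from any core other than its owner.
        if not self.shared and core != self.owner:
            raise PermissionError(f"table is private to core {self.owner}")
        # Only tables the application marked as shared pay for synchronization.
        if self.shared:
            self.lock_acquisitions += 1
        self.entries[key] = value


def run_demo():
    """Return the synchronization counts for a private and a shared table."""
    private = CoreLocalTable(owner=0)
    shared = CoreLocalTable(owner=0, shared=True)
    for i in range(4):
        private.insert(0, f"fd{i}", i)        # one core, no synchronization
    for core in range(4):
        shared.insert(core, f"fd{core}", core)  # four cores, synchronized updates
    return private.lock_acquisitions, shared.lock_acquisitions
```

In this model the private table completes its updates with no synchronization, while the shared table pays once per update, which mirrors the abstract's claim that sharing should cost something only when the application actually needs it.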
