Factored operating systems (fos): the case for a scalable operating system for multicores

The next decade will afford us computer chips with 100's to 1,000's of cores on a single piece of silicon. Contemporary operating systems have been designed to operate on a single core or small number of cores and hence are not well suited to manage and provide operating system services at such large scale. If multicore trends continue, the number of cores that an operating system will be managing will continue to double every 18 months. The traditional evolutionary approach of redesigning OS subsystems when there is insufficient parallelism will cease to work because the rate of increasing parallelism will far outpace the rate at which OS designers will be capable of redesigning subsystems. The fundamental design of operating systems and operating system data structures must be rethought to put scalability as the prime design constraint. This work begins by documenting the scalability problems of contemporary operating systems. These studies are used to motivate the design of a factored operating system (fos). fos is a new operating system targeting manycore systems with scalability as the primary design constraint, where space sharing replaces time sharing to increase scalability.We describe fos, which is built in a message passing manner, out of a collection of Internet inspired services. Each operating system service is factored into a set of communicating servers which in aggregate implement a system service. These servers are designed much in the way that distributed Internet services are designed, but instead of providing high level Internet services, these servers provide traditional kernel services and replace traditional kernel data structures in a factored, spatially distributed manner. fos replaces time sharing with space sharing. In other words, fos's servers are bound to distinct processing cores and by doing so do not fight with end user applications for implicit resources such as TLBs and caches. We describe how fos's design is well suited to attack the scalability challenge of future multicores and discuss how traditional application-operating systems interfaces can be redesigned to improve scalability.

[1]  Robbert van Renesse,et al.  Using Sparse Capabilities in a Distributed Operating System , 1986, ICDCS.

[2]  William J. Bolosky,et al.  Mach: A New Kernel Foundation for UNIX Development , 1986, USENIX Summer.

[3]  Andrew R. Cherenson,et al.  The Sprite network operating system , 1988, Computer.

[4]  Mark Horowitz,et al.  Cache performance of operating system and multiprogramming workloads , 1988, TOCS.

[5]  Partha Dasgupta,et al.  The Design and Implementation of the Clouds Distributed Operating System , 1989, Comput. Syst..

[6]  Mahadev Satyanarayanan,et al.  Scalable, secure, and highly available distributed file access , 1990, Computer.

[7]  Robbert van Renesse,et al.  The Amoeba distributed operating system - A status report , 1991, Comput. Commun..

[8]  Jochen Liedtke,et al.  On micro-kernel construction , 1995, SOSP.

[9]  Anoop Gupta,et al.  Hive: fault containment for shared-memory multiprocessors , 1995, SOSP.

[10]  Scott Devine,et al.  Disco: running commodity operating systems on scalable multiprocessors , 1997, TOCS.

[11]  Vivek Sarkar,et al.  Baring It All to Software: Raw Machines , 1997, Computer.

[12]  G.E. Moore,et al.  Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.

[13]  Michael Stumm,et al.  Tornado: maximizing locality and concurrency in a shared memory multiprocessor operating system , 1999, OSDI '99.

[14]  Ben Y. Zhao,et al.  OceanStore: an architecture for global-scale persistent storage , 2000, SIGP.

[15]  Vikas Agarwal,et al.  Clock rate versus IPC: the end of the road for conventional microarchitectures , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[16]  Mendel Rosenblum,et al.  Cellular disco: resource management using virtual clusters on shared-memory multiprocessors , 2000, TOCS.

[17]  Dilma Da Silva,et al.  Enabling Scalable Performance for General Purpose Workloads on Shared Memory Multiprocessors , 2003 .

[18]  GhemawatSanjay,et al.  The Google file system , 2003 .

[19]  Henry Hoffmann,et al.  Evaluation of the Raw microprocessor: an exposed-wire-delay architecture for ILP and streams , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[20]  Dilma Da Silva,et al.  Experience with K42, an open-source, Linux-compatible, scalable operating-system kernel , 2005, IBM Syst. J..

[21]  Henry Hoffmann,et al.  On-Chip Interconnection Architecture of the Tile Processor , 2007, IEEE Micro.

[22]  Saurabh Dighe,et al.  An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS , 2007, 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers.

[23]  HoffmannHenry,et al.  On-Chip Interconnection Architecture of the Tile Processor , 2007 .

[24]  Yang Zhang,et al.  Corey: An Operating System for Many Cores , 2008, OSDI.