Implementing global memory management in a workstation cluster

Advances in network and processor technology have greatly changed the communication and computational power of local-area workstation clusters. However, operating systems still treat workstation clusters as a collection of loosely connected processors, where each workstation acts as an autonomous and independent agent. This operating system structure makes it difficult to exploit the characteristics of current clusters, such as low-latency communication, huge primary memories, and high-speed processors, to improve the performance of cluster applications. This paper describes the design and implementation of global memory management in a workstation cluster. Our objective is to use a single, unified, but distributed memory management algorithm at the lowest level of the operating system. By managing memory globally at this level, all system- and higher-level software, including VM, file systems, transaction systems, and user applications, can benefit from available cluster memory. We have implemented our algorithm in the OSF/1 operating system running on an ATM-connected cluster of DEC Alpha workstations. Our measurements show that on a suite of memory-intensive programs, our system improves performance by a factor of 1.5 to 3.5. We also show that our algorithm has a performance advantage over previously proposed algorithms.
