Multi-level shared caching techniques for scalability in VMP-M/C

The problem of building a scalable shared memory multiprocessor can be reduced to that of building a scalable memory hierarchy, assuming interprocessor communication is handled by the memory system. In this paper, we describe the VMP-MC design, a distributed parallel multi-computer based on the VMP multiprocessor design, that is intended to provide a set of building blocks for configuring machines from one to several thousand processors. VMP-MC uses a memory hierarchy based on shared caches, ranging from on-chip caches, through board-level caches connected by busses, to a high-speed fiber optic ring at the bottom. In addition to describing the building block components of this architecture, we identify the key performance issues associated with the design and provide a performance evaluation of these issues using trace-driven simulation and measurements from the VMP. This work was sponsored in part by the Defense Advanced Research Projects Agency under Contract N00014-88-K-0619.
