Performance evaluation of a cluster-based multiprocessor built from ATM switches and bus-based multiprocessor servers

We consider a network of workstations (NOW) organization consisting of a number of bus-based multiprocessor servers interconnected by an ATM switch. A shared-memory model is supported by distributed virtual shared memory (DVSM), and this paper focuses on the access penalties incurred by (1) ATM and (2) the DVSM software. First, through detailed architectural simulations, we find that while the bandwidth and latency of the ATM switch fabrics are acceptable, the latency incurred by commercially available ATM interfaces has a first-order effect on performance. We also study the effects of various scheduling policies for the coherence handlers. Our data suggest that, since the probability of finding an idle processor within a cluster is high, a good policy is to schedule a coherence handler on such an idle processor instead of letting an extra compute processor execute the coherence handlers. Overall, we find that, with the ATM adaptation layer adjusted to a DVSM system, ATM is a promising technology for these kinds of systems.
