Performance experiences on Sun's Wildfire prototype

This paper presents performance results from work done on Sun’s WildFire system. WildFire is the codename for a prototype shared-memory multiprocessor developed by Sun Microsystems™, consisting of up to four unmodified Sun Enterprise™ x000 series symmetric multiprocessors (SMPs). A goal of the WildFire system is to evaluate how effectively large SMPs can be leveraged to construct even larger systems. We conducted several performance experiments with a shared-memory parallelized finite difference solver. Our work demonstrates the key features of the WildFire system, including automatic page migration and read/write replication. Our results show that the dynamic page migration algorithms used by the WildFire system are effective in automatically optimizing data placement at runtime. Performance comparisons between the WildFire system and currently available SMPs show that the system exhibits good scalability characteristics and actually outperforms SMPs on this particular application.
