Microservers: a new memory semantics for massively parallel computing

The semantics of memory (a large state that can be read or changed only a small piece at a time) has remained virtually untouched since von Neumann, and its effects, latency and bandwidth, have proved to be the major stumbling block for high-performance computing. This paper suggests a new model, termed “microservers,” that exploits “Processing-In-Memory” VLSI technology and that can reduce latency and memory traffic, increase inherent opportunities for concurrency, and support a variety of highly concurrent programming paradigms. Application of this model is then discussed in the framework of several ongoing supercomputing programs, particularly the HTMT petaflops project.
