Pursuing a petaflop: point designs for 100 TF computers using PIM technologies

This paper is a summary of a proposal submitted to the NSF 100 Tera Flops Point Design Study. Its main thesis is that Processing-In-Memory (PIM) technology can provide an extremely dense and highly efficient base on which such computing systems can be constructed. The paper describes a strawman organization of one potential PIM chip, how multiple such chips might be organized into a real system, what the software supporting such a system might look like, and several applications that we will attempt to place onto such a system.
