Reducing cost and tolerating defects in page-based intelligent memory

Active Pages is a page-based model of intelligent memory designed specifically to support virtualized hardware resources. Previous work has shown substantial performance benefits from offloading data-intensive tasks to a memory system that implements Active Pages. With a simple VLIW processor embedded near each page on DRAM, Active Page memory systems achieve speedups of up to 1000x over conventional memory systems. In this study, we examine Active Page memories that share, or multiplex, embedded VLIW processors across multiple physical Active Pages, and we explore the trade-off between individual page-processor performance and page-level multiplexing. We find that multiplexing reduces the hardware cost of computational logic from 31% of DRAM chip area to 12% without significant loss in performance. Furthermore, manufacturing defects that disable up to 50% of the page processors can be tolerated through efficient resource allocation and associative multiplexing.
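To make the defect-tolerance argument concrete, the following toy model contrasts a fixed page-to-processor wiring with an associative one in which any working processor may serve any page. It is only an illustrative sketch: the abstract does not describe the actual allocation scheme, so the function names, the round-robin policy, and the page and processor counts are all assumptions made for this example.

```python
def direct_mapped_unserved(num_pages, pages_per_proc, defective):
    """Count pages left without a processor when each page is hard-wired
    to processor (page // pages_per_proc). Hypothetical baseline."""
    return sum(1 for page in range(num_pages)
               if page // pages_per_proc in defective)


def associative_assign(num_pages, num_procs, defective):
    """Spread pages round-robin over the working processors only.
    Returns {processor: [pages]}. Assumed policy, not the paper's."""
    working = [p for p in range(num_procs) if p not in defective]
    if not working:
        raise ValueError("no working processors")
    assignment = {p: [] for p in working}
    for page in range(num_pages):
        assignment[working[page % len(working)]].append(page)
    return assignment


if __name__ == "__main__":
    defective = {1, 3, 5, 7}  # half of 8 processors disabled
    print(direct_mapped_unserved(16, 2, defective))
    loads = associative_assign(16, 8, defective)
    print({proc: len(pages) for proc, pages in loads.items()})
```

In this sketch, with half of the processors defective, the fixed wiring strands half of the pages, whereas the associative mapping keeps every page served at the price of doubling the number of pages each surviving processor multiplexes.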
