Performance Implications of Processing-in-Memory Designs on Data-Intensive Applications
Dong Li | Jishen Zhao | Florin Rusu | Borui Wang | Martin Torres
[1] Tack-Don Han,et al. An effective memory-processor integrated architecture for computer vision , 1997, Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162).
[2] Zvika Guz. Real-Time Analytics as the Killer Application for Processing-In-Memory , 2014 .
[3] Peter M. Kogge,et al. EXECUBE-A New Architecture for Scaleable MPPs , 1994, 1994 International Conference on Parallel Processing Vol. 1.
[4] David H. Bailey,et al. The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..
[5] Tze Meng Low,et al. 3D-Stacked Memory-Side Acceleration: Accelerator and System Design , 2014 .
[6] Franz Franchetti,et al. Data reorganization in memory using 3D-stacked DRAM , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[7] Jose Renau,et al. Programming the FlexRAM parallel intelligent memory system , 2003, PPoPP '03.
[8] Mike Ignatowski,et al. High-level Programming Model Abstractions for Processing in Memory , 2013 .
[9] Florin Rusu,et al. Scalable Analytics Model Calibration with Online Aggregation , 2015, IEEE Data Eng. Bull..
[10] Florin Rusu,et al. Scalable I/O-bound parallel incremental gradient descent for big data analytics in GLADE , 2013, DanaC '13.
[11] Kiyoung Choi,et al. A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[12] Josep Torrellas,et al. Automatic Code Mapping on an Intelligent Memory Architecture , 2001, IEEE Trans. Computers.
[13] Jaewook Shin,et al. Mapping Irregular Applications to DIVA, a PIM-based Data-Intensive Architecture , 1999, ACM/IEEE SC 1999 Conference (SC'99).
[14] Hyesoon Kim,et al. Instruction Offloading with HMC 2.0 Standard: A Case Study for Graph Traversals , 2015, MEMSYS.
[15] Tong Wen. Introduction to the X10 Implementation of NPB MG , 2006 .
[16] Gabriel H. Loh,et al. A Processing-in-Memory Taxonomy and a Case for Studying Fixed-function PIM , 2013 .
[17] Florin Rusu,et al. Speculative Approximations for Terascale Distributed Gradient Descent Optimization , 2015, DanaC@SIGMOD.
[18] Dean M. Tullsen,et al. Data-triggered Multithreading for Near-Data Processing , 2013 .
[19] Mike Ignatowski,et al. TOP-PIM: throughput-oriented programmable processing in memory , 2014, HPDC '14.
[20] Yu Cheng,et al. GLADE: big data analytics made easy , 2012, SIGMOD Conference.
[21] Peter M. Kogge,et al. The Characterization of Data Intensive Memory Workloads on Distributed PIM Systems , 2000, Intelligent Memory Systems.
[22] Gabriel H. Loh,et al. Thermal Feasibility of Die-Stacked Processing in Memory , 2014 .
[23] Kiyoung Choi,et al. PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[24] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[25] G. Seroussi,et al. Sidestep: Co-designed shiftable memory & software , 2012 .
[26] Chun Chen,et al. The architecture of the DIVA processing-in-memory chip , 2002, ICS '02.
[27] Maya Gokhale,et al. Processing in Memory: The Terasys Massively Parallel PIM Array , 1995, Computer.
[28] Christoforos E. Kozyrakis,et al. A case for intelligent RAM , 1997, IEEE Micro.
[29] Dave Brown,et al. Supplementary Material for An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing , 2013 .
[30] Frederic T. Chong,et al. Active pages: a computation model for intelligent memory , 1998, ISCA.
[31] Sudhakar Yalamanchili,et al. SIMT-based Logic Layers for Stacked DRAM Architectures: A Prototype , 2015, MEMSYS.
[32] Florin Rusu,et al. GLADE: a scalable framework for efficient analytics , 2012, OPSR.
[33] Florin Rusu,et al. Speculative Approximations for Terascale Analytics , 2014, ArXiv.
[34] Duncan G. Elliott,et al. Computational RAM: Implementing Processors in Memory , 1999, IEEE Des. Test Comput..