Practical Near-Data Processing to Evolve Memory and Storage Devices into Mainstream Heterogeneous Computing Systems

The capacity of memory and storage devices is expected to increase drastically with adoption of the forthcoming memory and integration technologies. This is a welcome improvement especially for datacenter servers running modern data-intensive applications. Nonetheless, for such servers to fully benefit from the increasing capacity, the bandwidth of interconnects between processors and these devices must also increase proportionally, which becomes ever costlier under unabating physical constraints. As a promising alternative to tackle this challenge cost-effectively, a heterogeneous computing paradigm referred to as near-data processing (NDP) has emerged. However, NDP has not yet been widely adopted by the industry because of significant gaps between existing software stacks and demanded ones for NDP-capable memory and storage devices. Aiming to overcome the gaps, we propose to turn memory and storage devices into familiar heterogeneous distributed computing systems. Then, we demonstrate potentials of such computing systems for existing data-intensive applications with two recently implemented NDP-capable devices. Finally, we conclude with a practical blueprint to exploit the NDP-based computing systems for speeding up solving future computer-aided design and optimization problems.

[1]  Kiyoung Choi,et al.  PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[2]  Shekhar Borkar,et al.  Role of Interconnects in the Future of Computing , 2013, Journal of Lightwave Technology.

[3]  Ken Mai,et al.  The future of wires , 2001, Proc. IEEE.

[4]  Daniel J. Abadi,et al.  Integrating compression and execution in column-oriented database systems , 2006, SIGMOD Conference.

[5]  Christoforos E. Kozyrakis,et al.  Practical Near-Data Processing for In-Memory Analytics Frameworks , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).

[6]  Chun Chen,et al.  The architecture of the DIVA processing-in-memory chip , 2002, ICS '02.

[7]  Jung Ho Ahn,et al.  Chameleon: Versatile and practical near-DRAM acceleration architecture for large memory systems , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[8]  Maysam Lavasani Generating irregular data-stream accelerators : methodology and applications , 2015 .

[9]  Maya Gokhale,et al.  Near memory key/value lookup acceleration , 2017, MEMSYS.

[10]  William J. Dally,et al.  GPUs and the Future of Parallel Computing , 2011, IEEE Micro.

[11]  Jinjun Xiong,et al.  Application-Transparent Near-Memory Processing Architecture with Memory Channel Network , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[12]  Mike Ignatowski,et al.  TOP-PIM: throughput-oriented programmable processing in memory , 2014, HPDC '14.

[13]  Ronald G. Dreslinski,et al.  Integrated 3D-stacked server designs for increasing physical density of key-value stores , 2014, ASPLOS.

[14]  Kiyoung Choi,et al.  A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[15]  Jung Ho Ahn,et al.  Near-DRAM Acceleration with Single-ISA Heterogeneous Processing in Standard Memory Modules , 2016, IEEE Micro.

[16]  Duncan G. Elliott,et al.  Computational Ram: A Memory-simd Hybrid And Its Application To Dsp , 1992, 1992 Proceedings of the IEEE Custom Integrated Circuits Conference.

[17]  Sangyeun Cho,et al.  YourSQL: A High-Performance Database System Leveraging In-Storage Computing , 2016, Proc. VLDB Endow..