Networking and Storage: The Next Computing Elements in Exascale Systems?

Many large computer clusters offer alternative computing elements in addition to general-purpose CPUs. GPUs and FPGAs are very common choices. Two emerging technologies can further widen the options in that context: in-network computing (INC) and near-storage processing (NSP). These technologies support computing over data that is in transit between nodes or that resides inside the storage stack, respectively. There are several advantages to moving computations to INC and NSP platforms. Notably, the original computation path does not need to be altered to route data through these subsystems, because the network and the storage are naturally present in most computations. In this paper, we describe the evolutionary steps that led to INC and NSP platforms and discuss how they can improve critical computing paths in large-scale database systems. In the process, we comment on the constraints that the current generation of these platforms presents and explain why we believe these platforms to be relevant to the next generation of exascale systems.
