Bandana: Using Non-volatile Memory for Storing Deep Learning Models

Typical large-scale recommender systems use deep learning models that are stored in a large amount of DRAM. These models often rely on embeddings, which consume most of the required memory. We present Bandana, a storage system that reduces the DRAM footprint of embeddings by using Non-volatile Memory (NVM) as the primary storage medium, with a small amount of DRAM as a cache. The main challenge in storing embeddings on NVM is its limited read bandwidth compared to DRAM. Bandana uses two primary techniques to address this limitation. First, it uses hypergraph partitioning to store embedding vectors that are likely to be read together in the same physical location. Second, it decides the number of embedding vectors to cache in DRAM by simulating dozens of small caches. These techniques allow Bandana to increase the effective read bandwidth of NVM by 2-3x and thereby significantly reduce the total cost of ownership.
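
The second technique, choosing a cache size by simulating several small caches, can be illustrated with a minimal sketch. The abstract does not describe Bandana's simulator, so everything here is an assumption made for illustration: the LRU eviction policy, the function names (`lru_hit_rate`, `simulate_cache_sizes`, `pick_capacity`), and the rule for picking a size.

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Replay a trace of embedding-vector IDs through an LRU cache.

    Hypothetical helper: LRU is an assumed policy, not necessarily
    the one Bandana uses. Returns the fraction of accesses that hit.
    """
    cache = OrderedDict()
    hits = 0
    for vec_id in trace:
        if vec_id in cache:
            hits += 1
            cache.move_to_end(vec_id)      # mark as most recently used
        else:
            cache[vec_id] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(trace) if trace else 0.0


def simulate_cache_sizes(trace, capacities):
    """Run one simulated cache per candidate capacity over the same trace."""
    return {c: lru_hit_rate(trace, c) for c in capacities}


def pick_capacity(trace, capacities):
    """Pick the capacity with the best hit rate, preferring smaller on ties."""
    rates = simulate_cache_sizes(trace, capacities)
    return min(capacities, key=lambda c: (-rates[c], c))


if __name__ == "__main__":
    # Toy access trace: three vectors read in a repeating cycle.
    trace = [1, 2, 3, 1, 2, 3]
    print(simulate_cache_sizes(trace, [2, 3, 4]))
    print(pick_capacity(trace, [2, 3, 4]))
```

In this toy trace a cache of two vectors never hits, because each vector is evicted just before it is read again, while a cache of three holds the whole working set. A real system would replay sampled production traces and weigh hit rate against DRAM cost rather than maximize hit rate alone.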
