AutoShard: Automated Embedding Table Sharding for Recommender Systems

Embedding learning is an important technique in deep recommendation models for mapping categorical features to dense vectors. However, embedding tables often require an extremely large number of parameters, which makes them a storage and efficiency bottleneck. Distributed training solutions address this by partitioning the embedding tables across multiple devices. However, the partitioning can easily lead to cost imbalances across devices if it is not done carefully. This is a significant design challenge of distributed systems known as embedding table sharding, i.e., how we should partition the embedding tables to balance the costs across devices. The task is non-trivial because 1) it is hard to measure the cost efficiently and precisely, and 2) the partition problem is known to be NP-hard. In this work, we introduce our practice at Meta, namely AutoShard, which uses a neural cost model to directly predict multi-table costs and leverages deep reinforcement learning to solve the partition problem. Experimental results on an open-sourced large-scale synthetic dataset and Meta's production dataset demonstrate the superiority of AutoShard over the heuristics. Moreover, the learned policy of AutoShard can transfer to sharding tasks with various numbers of tables and different ratios of unseen tables without any fine-tuning. Furthermore, AutoShard can efficiently shard hundreds of tables in seconds. The effectiveness, transferability, and efficiency of AutoShard make it desirable for production use. Our algorithms have been deployed in Meta's production environment. A prototype is available at https://github.com/daochenzha/autoshard
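
To make the sharding problem concrete, the following is a minimal sketch of a greedy load-balancing heuristic of the general kind that AutoShard is compared against. It is not AutoShard's method, and it is not taken from the prototype repository. It assumes each table has a single known scalar cost and that costs on a device simply add up. Both are simplifications, since the abstract notes that costs are hard to measure and that AutoShard predicts multi-table costs directly. The names `greedy_shard`, `device_loads`, and `table_costs` are hypothetical.

```python
import heapq
from typing import Dict, List


def greedy_shard(table_costs: Dict[str, float], num_devices: int) -> List[List[str]]:
    """Place tables on devices, largest cost first, always picking the
    device with the smallest accumulated cost so far.

    Assumption (for illustration only): per-table costs are known scalars
    and the cost of a device is the sum of the costs of its tables.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")

    # Min-heap of (accumulated cost, device index).
    heap = [(0.0, device) for device in range(num_devices)]
    heapq.heapify(heap)
    placement: List[List[str]] = [[] for _ in range(num_devices)]

    # Sort by descending cost; break ties by name so the result is deterministic.
    ordered = sorted(table_costs.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, cost in ordered:
        load, device = heapq.heappop(heap)
        placement[device].append(name)
        heapq.heappush(heap, (load + cost, device))
    return placement


def device_loads(table_costs: Dict[str, float], placement: List[List[str]]) -> List[float]:
    """Sum the assumed per-table costs on each device."""
    return [sum(table_costs[name] for name in shard) for shard in placement]


# Hypothetical per-table costs, in arbitrary units.
costs = {"a": 5, "b": 4, "c": 3, "d": 3, "e": 1}
placement = greedy_shard(costs, num_devices=2)
loads = device_loads(costs, placement)
```

In this toy case the greedy rule happens to balance the two devices exactly. In general it gives no such guarantee, and when the true cost of a device is not the sum of its per-table costs, a heuristic of this kind can misjudge the balance.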
