FlashEmbedding: storing embedding tables in SSD for large-scale recommender systems
Tei-Wei Kuo | Chia-Lin Yang | Chun Jason Xue | Yufei Cui | Hu Wan | Xuan Sun
[1] Minsub Kim, et al. Reducing tail latency of DNN-based recommender systems using in-storage processing, 2020, APSys.
[2] Martin D. Schatz, et al. Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications, 2018, ArXiv.
[3] Dik Lun Lee, et al. Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba, 2018, KDD.
[4] Developing a Recommendation Benchmark for MLPerf Training and Inference, 2020, ArXiv.
[5] J. Ian Munro, et al. Robin hood hashing, 1985, 26th Annual Symposium on Foundations of Computer Science (sfcs 1985).
[6] Carole-Jean Wu, et al. RecSSD: near data processing for solid state drive based recommendation inference, 2021, ASPLOS.
[7] Jian Huang, et al. FlatFlash: Exploiting the Byte-Accessibility of SSDs within a Unified Memory-Storage Hierarchy, 2019, ASPLOS.
[8] Carole-Jean Wu, et al. DeepRecSys: A System for Optimizing End-To-End At-Scale Neural Recommendation Inference, 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[9] Joo Young Hwang, et al. 2B-SSD: The Case for Dual, Byte- and Block-Addressable Solid-State Drives, 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[10] Bor-Yiing Su, et al. Deep Learning Training in Facebook Data Centers: Design of Scale-up and Scale-out Systems, 2020, ArXiv.
[11] Jason Cong, et al. INSIDER: Designing In-Storage Computing System for Emerging High-Performance Drive, 2019, USENIX Annual Technical Conference.
[12] David M. Brooks, et al. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective, 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[13] Carole-Jean Wu, et al. The Architectural Implications of Facebook's DNN-Based Personalized Recommendation, 2019, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[14] Sachin Katti, et al. Bandana: Using Non-volatile Memory for Storing Deep Learning Models, 2018, MLSys.
[15] Carole-Jean Wu, et al. Cross-Stack Workload Characterization of Deep Recommendation Systems, 2020, 2020 IEEE International Symposium on Workload Characterization (IISWC).
[16] Minsoo Rhu, et al. Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations, 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[17] Minsoo Rhu, et al. TensorDIMM: A Practical Near-Memory Processing Architecture for Embeddings and Tensor Operations in Deep Learning, 2019, MICRO.
[18] Hsu Cynthia, et al. 13.5 A 128Gb 1b/Cell 96-Word-Line-Layer 3D Flash Memory to Improve Random Read Latency with tPROG=75μs and tR=4μs, 2020.
[19] Tae Jun Ham, et al. MERCI: efficient embedding reduction on commodity hardware via sub-query memoization, 2021, ASPLOS.
[20] Martin D. Schatz, et al. RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing, 2019, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).