ChatCache: A Hierarchical Semantic Redundancy Cache System for Conversational Services at Edge
[1] Rajkumar Buyya,et al. A Taxonomy and Survey of Content Delivery Networks, 2006.
[2] Bo Hu,et al. FoggyCache: Cross-Device Approximate Computation Reuse , 2018, MobiCom.
[3] Xin Wang,et al. Clipper: A Low-Latency Online Prediction Serving System , 2016, NSDI.
[4] Juan Enrique Ramos,et al. Using TF-IDF to Determine Word Relevance in Document Queries, 2003.
[5] Omer Levy,et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding , 2018, BlackboxNLP@EMNLP.
[6] Katherine Guo,et al. Cachier: Edge-Caching for Recognition Applications , 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).
[7] A. Iyengar,et al. CHA: A Caching Framework for Home-based Voice Assistant Systems , 2020, 2020 IEEE/ACM Symposium on Edge Computing (SEC).
[8] Kevin Duh,et al. Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning , 2020, RepL4NLP@ACL.
[9] Similarity Caching: Theory and Algorithms , 2019, IEEE INFOCOM 2020 - IEEE Conference on Computer Communications.
[10] Thomas Wolf,et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter , 2019, ArXiv.
[11] Frank Bentley,et al. Understanding the Long-Term Use of Smart Speaker Assistants , 2018, Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.
[12] Pawan Kumar,et al. Improve performance of machine translation service using memcached , 2017, 2017 17th International Conference on Computational Science and Its Applications (ICCSA).
[13] M. Newman. Power laws, Pareto distributions and Zipf's law, 2005.
[14] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[15] Jonathan Foote,et al. Content-based retrieval of music and audio , 1997, Other Conferences.
[16] Yoshua Bengio,et al. Speech Model Pre-training for End-to-End Spoken Language Understanding , 2019, INTERSPEECH.
[17] Chuang Gan,et al. Once for All: Train One Network and Specialize it for Efficient Deployment , 2019, ICLR.
[18] Wei Gao,et al. MUVR: Supporting Multi-User Mobile Virtual Reality with Resource Constrained Edge Cloud , 2018, 2018 IEEE/ACM Symposium on Edge Computing (SEC).
[19] David G. Lowe,et al. Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[20] Thrasyvoulos Spyropoulos,et al. Femto-Caching with Soft Cache Hits: Improving Performance with Related Content Recommendation , 2017, GLOBECOM 2017 - 2017 IEEE Global Communications Conference.
[21] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[22] Walid Saad,et al. Proactive edge computing in latency-constrained fog networks , 2017, 2017 European Conference on Networks and Communications (EuCNC).
[23] Salvatore Orlando,et al. A metric cache for similarity search , 2008, LSDS-IR '08.
[24] Moshe Wasserblat,et al. Q8BERT: Quantized 8Bit BERT , 2019, 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS).
[25] Song Han,et al. HAT: Hardware-Aware Transformers for Efficient Natural Language Processing , 2020, ACL.
[26] Pieter Hintjens,et al. ZeroMQ: Messaging for Many Applications, 2013.
[27] Jason Weston,et al. Memory Networks , 2014, ICLR.
[28] Fabrizio Lillo,et al. Estimating the Total Volume of Queries to Google , 2019, WWW.
[29] Bowen Zhou,et al. Applying deep learning to answer selection: A study and an open task , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[30] Tao Zhang,et al. A Survey of Model Compression and Acceleration for Deep Neural Networks , 2017, ArXiv.
[31] Thrasyvoulos Spyropoulos,et al. Soft Cache Hits: Improving Performance Through Recommendation and Delivery of Related Content , 2018, IEEE Journal on Selected Areas in Communications.
[32] Yury A. Malkov,et al. Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[33] Yongming Huang,et al. Proactive Caching for Vehicular Multi-View 3D Video Streaming via Deep Reinforcement Learning , 2019, IEEE Transactions on Wireless Communications.