InfiniCache: Exploiting Ephemeral Serverless Functions to Build a Cost-Effective Memory Cache

Internet-scale web applications are becoming increasingly storage-intensive and rely heavily on in-memory object caching to attain the required I/O performance. We argue that the emerging serverless computing paradigm provides a well-suited, cost-effective platform for object caching. We present InfiniCache, a first-of-its-kind in-memory object caching system that is built and deployed entirely atop ephemeral serverless functions. InfiniCache exploits and orchestrates serverless functions' memory resources to enable elastic pay-per-use caching. InfiniCache's design combines erasure coding, intelligent billed duration control, and an efficient data backup mechanism to maximize data availability and cost-effectiveness while balancing the risk of losing cached state against performance. We implement InfiniCache on AWS Lambda and show that it: (1) achieves 31-96× tenant-side cost savings compared to AWS ElastiCache for a large-object-only production workload, (2) can effectively provide 95.4% data availability for each one-hour window, and (3) delivers performance comparable to that of a typical in-memory cache.
