Benchmarking, analysis, and optimization of serverless function snapshots

Serverless computing has seen rapid adoption due to its high scalability and flexible, pay-as-you-go billing model. In serverless, developers structure their services as a collection of functions that are invoked sporadically by events such as clicks. The high variability in the inter-arrival time of function invocations motivates providers to start new function instances upon each invocation, leading to significant cold-start delays that degrade user experience. To reduce cold-start latency, the industry has turned to snapshotting, whereby an image of a fully booted function is stored on disk, enabling faster invocation than booting the function from scratch. This work introduces vHive, an open-source framework for serverless experimentation that aims to enable researchers to study and innovate across the entire serverless stack. Using vHive, we characterize a state-of-the-art snapshot-based serverless infrastructure built on the industry-leading Containerd orchestration framework and the Firecracker hypervisor. We find that the execution time of a function started from a snapshot is 95% higher, on average, than when the same function is memory-resident. We show that the high latency is attributable to frequent page faults, as the function's state is brought from disk into guest memory one page at a time. Our analysis further reveals that a function accesses the same stable working set of pages across its invocations. Leveraging this insight, we build REAP, a lightweight software mechanism for serverless hosts that records a function's stable working set of guest memory pages and proactively prefetches it from disk into memory. Compared to baseline snapshotting, REAP reduces cold-start delays by 3.7x, on average.
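
The record-and-prefetch idea described in the abstract can be illustrated with a small toy model. The sketch below is not REAP's implementation: the class `LazySnapshot`, the function `invoke`, the 4 KiB page size, and the access pattern are all hypothetical names and values chosen for illustration. It only shows why recording the pages touched by one invocation, and loading them up front for the next, removes the per-page faults that lazy snapshot loading incurs.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class LazySnapshot:
    """Toy guest memory backed by a snapshot; pages load on first access."""

    def __init__(self, snapshot: bytes):
        self.snapshot = snapshot
        self.resident = {}   # page number -> page contents
        self.faults = 0      # number of on-demand, single-page loads
        self.trace = []      # page numbers in first-touch order

    def _load(self, page: int) -> bytes:
        start = page * PAGE_SIZE
        return self.snapshot[start:start + PAGE_SIZE]

    def read(self, addr: int) -> int:
        page = addr // PAGE_SIZE
        if page not in self.resident:
            # Simulated page fault: bring in exactly one page.
            self.faults += 1
            self.trace.append(page)
            self.resident[page] = self._load(page)
        return self.resident[page][addr % PAGE_SIZE]

    def prefetch(self, working_set) -> None:
        # Load the recorded working set before the function runs.
        for page in working_set:
            if page not in self.resident:
                self.resident[page] = self._load(page)


def invoke(mem: LazySnapshot) -> int:
    """Hypothetical function body that touches the same addresses each time."""
    return sum(mem.read(addr) for addr in (0, 10, 5000, 5001, 12288))


snapshot = bytes(range(256)) * 64  # 16 KiB, i.e. four pages

# Record phase: the first invocation faults pages in one at a time.
cold = LazySnapshot(snapshot)
cold_result = invoke(cold)
recorded = list(cold.trace)
cold_faults = cold.faults

# Prefetch phase: a later invocation loads the recorded pages up front.
warm = LazySnapshot(snapshot)
warm.prefetch(recorded)
warm_result = invoke(warm)
warm_faults = warm.faults
```

In this model the first invocation takes one fault per distinct page touched, while the second takes none, because its working set is already resident and the function returns the same result. The benefit depends on the working set being stable across invocations, which is the property the analysis above reports.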
