Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider

Function as a Service (FaaS) has been gaining popularity as a way to deploy computations to serverless backends in the cloud. This paradigm shifts the complexity of allocating and provisioning resources to the cloud provider, which has to provide the illusion of always-available resources (i.e., fast function invocations without cold starts) at the lowest possible resource cost. Doing so requires the provider to deeply understand the characteristics of the FaaS workload. Unfortunately, there has been little to no public information on these characteristics. Thus, in this paper, we first characterize the entire production FaaS workload of Azure Functions. We show, for example, that most functions are invoked very infrequently, but there is an 8-order-of-magnitude range of invocation frequencies. Using observations from our characterization, we then propose a practical resource management policy that significantly reduces the number of function cold starts, while spending fewer resources than state-of-the-practice policies.
