Client Microservice Network 1 ) Client generates job i 2 ) Register job arrival event Event Queue 3 )

Current cloud services are moving away from monolithic designs and towards graphs of many looselycoupled, single-concerned microservices. Microservices have several advantages, including speeding up development and deployment, allowing specialization of the software infrastructure, and helping with debugging and error isolation. At the same time they introduce several hardware and software challenges. Given that most of the performance and efficiency implications of microservices happen at scales larger than what is available outside production deployments, studying such effects requires designing the right simulation infrastructures. We present μqSim, a scalable and validated queueing network simulator designed specifically for interactive microservices. μqSim provides detailed intraand inter-microservice models that allow it to faithfully reproduce the behavior of complex, many-tier applications. μqSim is also modular, allowing reuse of individual models across microservices and end-toend applications. We have validated μqSim both against simple and more complex microservices graphs, and have shown that it accurately captures performance in terms of throughput and tail latency. Finally, we use μqSim to model the tail at scale effects of request fanout, and the performance impact of power management in latency-sensitive microservices.

[1]  M. Thomas Queueing Systems. Volume 1: Theory (Leonard Kleinrock) , 1976 .

[2]  Mario Gerla,et al.  GloMoSim: a library for parallel simulation of large-scale wireless networks , 1998 .

[3]  Ripal Nathuji,et al.  Exploiting Platform Heterogeneity for Power Efficient Data Centers , 2007, Fourth International Conference on Autonomic Computing (ICAC'07).

[4]  Teerawat Issariyakul,et al.  Introduction to Network Simulator NS2 , 2008 .

[5]  Thomas R. Henderson,et al.  Network Simulations with the ns-3 Simulator , 2008 .

[6]  Thomas F. Wenisch,et al.  PowerNap: eliminating server idle power , 2009, ASPLOS.

[7]  Shengli Zhou,et al.  Aqua-Sim: An NS-2 based simulator for underwater sensor networks , 2009, OCEANS 2009.

[8]  Luiz André Barroso,et al.  The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , 2009, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines.

[9]  Aman Kansal,et al.  Q-clouds: managing performance interference effects for QoS-aware clouds , 2010, EuroSys '10.

[10]  L. Barroso Warehouse-Scale Computing: Entering the Teenage Decade , 2011, SIGARCH Comput. Archit. News.

[11]  Kevin Skadron,et al.  Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[12]  Junjie Wu,et al.  BigHouse: A simulation infrastructure for data center systems , 2012, 2012 IEEE International Symposium on Performance Analysis of Systems & Software.

[13]  Randy H. Katz,et al.  Heterogeneity and dynamicity of clouds at scale: Google trace analysis , 2012, SoCC '12.

[14]  Lingjia Tang,et al.  Whare-map: heterogeneity in "homogeneous" warehouse-scale computers , 2013, ISCA.

[15]  Christina Delimitrou,et al.  QoS-Aware scheduling in heterogeneous datacenters with paragon , 2013, TOCS.

[16]  Christina Delimitrou,et al.  Paragon: QoS-aware scheduling for heterogeneous datacenters , 2013, ASPLOS '13.

[17]  Lingjia Tang,et al.  Bubble-flux: precise online QoS management for increased utilization in warehouse scale computers , 2013, ISCA.

[18]  Mor Harchol-Balter Performance Modeling and Design of Computer Systems: The M/G/1 Queue and the Inspection Paradox , 2013 .

[19]  Christina Delimitrou,et al.  Quasar: resource-efficient and QoS-aware cluster management , 2014, ASPLOS.

[20]  Christoforos E. Kozyrakis,et al.  Towards energy proportionality for large-scale latency-critical workloads , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[21]  Christina Delimitrou,et al.  Tarcil: reconciling scheduling speed and quality in large shared clusters , 2015, SoCC.

[22]  Christoforos E. Kozyrakis,et al.  Heracles: Improving resource efficiency at scale , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[23]  Ronald G. Dreslinski,et al.  Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[24]  Takuya Nakaike,et al.  Workload characterization for microservices , 2016, 2016 IEEE International Symposium on Workload Characterization (IISWC).

[25]  Christina Delimitrou,et al.  Workload characterization of interactive cloud services on big and small server platforms , 2017, 2017 IEEE International Symposium on Workload Characterization (IISWC).

[26]  Christina Delimitrou,et al.  The Architectural Implications of Cloud Microservices , 2018, IEEE Computer Architecture Letters.

[27]  Christina Delimitrou,et al.  Seer : Leveraging Big Data to Navigate The Complexity of Cloud Debugging , 2018 .

[28]  Thomas F. Wenisch,et al.  μ Suite: A Benchmark Suite for Microservices , 2018, 2018 IEEE International Symposium on Workload Characterization (IISWC).

[29]  Jun Sun,et al.  Poster: Benchmarking Microservice Systems for Software Engineering Research , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE-Companion).

[30]  Yuan He,et al.  Seer: Leveraging Big Data to Navigate the Complexity of Performance Debugging in Cloud Microservices , 2019, ASPLOS.

[31]  Yuan He,et al.  An Open-Source Benchmark Suite for Microservices and Their Hardware-Software Implications for Cloud & Edge Systems , 2019, ASPLOS.