Response Time Characterization of Microservice-Based Systems

In pursuit of faster development cycles, companies have favored small, decoupled services over monoliths. Following this trend, distributed systems made of microservices have grown in scale and complexity, giving rise to a new set of operational problems. Although this paradigm simplifies the development, deployment, and management of individual services, it hinders system observability. In particular, performance monitoring and analysis become more challenging, especially for critical production systems that have grown organically, operate continuously, and cannot afford the availability cost of online benchmarking. These systems are also often very large and expensive, which makes them poor candidates for full-scale development replicas. Creating models of services and systems for characterization and formal analysis can alleviate these issues. Because performance, and response time in particular, is the main concern of this work, we focus on bottleneck detection and optimal resource scheduling. We propose a method for modeling production services as queuing systems from request traces. We also provide analytical tools for response time characterization and optimal resource allocation. Our results show that a simple queuing system with a single queue and multiple homogeneous servers has a small parameter space that can be estimated in production. The resulting model can be used to accurately predict the response time distribution and the number of instances needed to maintain a desired service level under a given load.
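To make the last two sentences concrete, the following is a minimal sketch of how a single-queue, multi-server model can turn a load estimate into a response time distribution and an instance count. It assumes the textbook M/M/c queue (Poisson arrivals, exponential service times, first-come-first-served), which the abstract does not state and which may differ from the model actually used in the work. The function names and the parameters `lam` (arrival rate), `mu` (per-instance service rate), `t` (response time budget), and `target` (service level) are illustrative choices, not taken from the source.

```python
import math


def erlang_c(c, offered_load):
    """Probability that an arriving request has to queue in an M/M/c system.

    offered_load is lam / mu; the system is stable only if offered_load < c.
    """
    a = offered_load
    if a >= c:
        return 1.0  # unstable: every request eventually waits
    # Erlang B blocking probability via the numerically stable recursion.
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    rho = a / c
    # Standard conversion from Erlang B to Erlang C.
    return b / (1.0 - rho * (1.0 - b))


def response_time_tail(t, c, lam, mu):
    """P(response time > t) for an M/M/c FCFS queue (assumed model)."""
    a = lam / mu
    if a >= c:
        return 1.0
    pw = erlang_c(c, a)
    theta = c * mu - lam  # decay rate of the waiting time, given a wait
    if math.isclose(theta, mu, rel_tol=1e-9):
        # Waiting and service rates coincide: the sum is Erlang-2.
        return math.exp(-mu * t) * (1.0 + pw * mu * t)
    # Mixture: no wait (service only) or wait + service (hypoexponential).
    no_wait = (1.0 - pw) * math.exp(-mu * t)
    with_wait = pw * (theta * math.exp(-mu * t) - mu * math.exp(-theta * t)) / (theta - mu)
    return no_wait + with_wait


def min_instances(lam, mu, t, target, max_c=10_000):
    """Smallest c such that P(response time <= t) >= target."""
    if math.exp(-mu * t) > 1.0 - target:
        # Even with no queueing, service time alone misses the target.
        raise ValueError("target unreachable for this service rate")
    c = math.floor(lam / mu) + 1  # first instance count that is stable
    while c <= max_c:
        if response_time_tail(t, c, lam, mu) <= 1.0 - target:
            return c
        c += 1
    raise ValueError("no instance count up to max_c meets the target")


if __name__ == "__main__":
    # Hypothetical load: 90 requests/s, each instance serves 10 requests/s,
    # and 95% of requests should complete within 0.5 s.
    print(min_instances(lam=90.0, mu=10.0, t=0.5, target=0.95))
```

In this sketch the whole model is described by `lam`, `mu`, and the instance count `c`, which illustrates why such a parameter space is small enough to estimate from request traces. A model with general service times would need a different tail formula or a simulation.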
