Profile-based Detection of Layered Bottlenecks

Detecting software bottlenecks that prevent an application from fully utilizing hardware resources is a classic but complex problem, because such bottlenecks have layered structures. Existing approaches each fall short: model-based approaches require a performance model to be given, which is impractical to maintain in today's agile development environments, while profile-based approaches do not handle the layered structures of software bottlenecks. This paper proposes a novel approach that takes the best of both worlds: it extracts a performance model from execution profiles of the target application and uses it to detect layered bottlenecks. We collect a wake-up profile of threads, which samples events in which one thread wakes up another thread, and build a thread dependency graph from it to detect the layered bottlenecks. We implement our approach of profile-based detection of layered bottlenecks in the Go programming language. We demonstrate that our method can detect software bottlenecks limiting the scalability and throughput of state-of-the-art middleware, such as a web application server and a permissioned blockchain network, with a small runtime overhead for profile collection.
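
To make the idea of a wake-up profile and a thread dependency graph concrete, the following is a minimal sketch in Go, not the paper's implementation. The `WakeUp` record, the thread names, and the "follow the heaviest edge" heuristic in `RootBottleneck` are all assumptions introduced for illustration; the abstract does not specify the sample format or the detection algorithm.

```go
package main

import "fmt"

// WakeUp is a hypothetical profile sample: Waker woke up Wakee Count times.
type WakeUp struct {
	Waker, Wakee string
	Count        int
}

// Graph maps each waiting thread to the threads that woke it,
// weighted by how many samples recorded that wake-up.
type Graph map[string]map[string]int

// BuildGraph aggregates wake-up samples into a thread dependency graph.
// An edge wakee -> waker means "wakee was waiting on waker".
func BuildGraph(samples []WakeUp) Graph {
	g := Graph{}
	for _, s := range samples {
		if g[s.Wakee] == nil {
			g[s.Wakee] = map[string]int{}
		}
		g[s.Wakee][s.Waker] += s.Count
	}
	return g
}

// RootBottleneck follows the heaviest dependency edge from start until it
// reaches a thread that waits on nobody, or revisits a thread (a cycle).
// This is an illustrative heuristic, assumed here, for descending through
// layers of waiting threads.
func RootBottleneck(g Graph, start string) string {
	seen := map[string]bool{}
	cur := start
	for !seen[cur] {
		seen[cur] = true
		next, best := "", 0
		for waker, n := range g[cur] {
			// Break ties by name so the result is deterministic.
			if n > best || (n == best && waker < next) {
				next, best = waker, n
			}
		}
		if next == "" {
			return cur // cur does not wait on any other thread
		}
		cur = next
	}
	return cur
}

func main() {
	// Hypothetical samples: handler mostly waits on worker,
	// and worker mostly waits on db-pool.
	g := BuildGraph([]WakeUp{
		{Waker: "worker", Wakee: "handler", Count: 80},
		{Waker: "logger", Wakee: "handler", Count: 5},
		{Waker: "db-pool", Wakee: "worker", Count: 60},
		{Waker: "handler", Wakee: "worker", Count: 10},
	})
	fmt.Println(RootBottleneck(g, "handler"))
}
```

In this sketch, a thread that many others wait on, but that itself waits on no one, is reported as the candidate bottleneck at the bottom of the layers. A real analysis would presumably also need to account for how busy each thread is, which this example leaves out.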
