Analyzing system performance with probabilistic performance annotations

To understand, debug, and predict the performance of complex software systems, we develop the concept of probabilistic performance annotations. In essence, we annotate components (e.g., methods) with a relation between a measurable performance metric, such as running time, and one or more features of the input or of the state of that component. We use two forms of regression analysis: regression trees and mixture models. Such relations can capture non-trivial behaviors beyond the more classic algorithmic complexity of a component. We present a method to derive such annotations automatically by generalizing observed measurements. We illustrate our approach on three complex systems: the ownCloud distributed storage service, the MySQL database system, and the x264 video encoder library and application, producing non-trivial characterizations of their performance. Notably, we isolate a performance regression and identify the root cause of a second performance bug in MySQL.
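
To make the idea concrete, the following is a minimal sketch, not the paper's implementation, of how such an annotation could be derived from per-call measurements using off-the-shelf tools: a regression tree partitions the input-feature space into regions with distinct cost behavior, and a Gaussian mixture captures a multi-modal timing distribution within one region. The workload, the "size" feature, the threshold, and the hit/miss modes are all synthetic assumptions for illustration.

```python
# Hypothetical sketch of deriving a probabilistic performance annotation
# from (feature, running time) measurements; not the authors' tool.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic measurements: one input feature (e.g., request size) and a
# running time that becomes bimodal (e.g., cache hit vs. miss) above a
# size threshold -- exactly the kind of behavior a single big-O misses.
size = rng.uniform(1, 1000, 5000)
hit = rng.random(5000) < 0.7
time = np.where(size < 500,
                0.1 * size,                    # fast path: linear in size
                np.where(hit, 0.1 * size,      # hit: still linear
                         0.5 * size + 50.0))   # miss: steeper cost
time += rng.normal(0, 5, 5000)                 # measurement noise

# Step 1: a regression tree generalizes the measurements into regions
# of the feature space, giving the structure of the annotation.
tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=200)
tree.fit(size.reshape(-1, 1), time)
print(export_text(tree, feature_names=["size"]))

# Step 2: within a region, a Gaussian mixture models the multi-modal
# timing (two modes here, for the hit/miss behavior above the threshold).
big = time[size >= 500].reshape(-1, 1)
gmm = GaussianMixture(n_components=2, random_state=0).fit(big)
for w, m in zip(gmm.weights_, gmm.means_.ravel()):
    print(f"mode: weight={w:.2f}, mean running time={m:.1f}")
```

Read off this way, the tree's leaves become human-readable predicates over input features ("if size < 500, time is roughly linear in size"), while the mixture weights and means attach probabilities to the distinct cost modes within a leaf, which is what makes the resulting annotation probabilistic rather than a single point estimate.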
