Selective Profiling for OS Scalability Study on Multicore Systems

With more cores becoming available in each new generation of microprocessors (following the well-known Moore's Law), scalability is becoming an increasingly important issue. Scalability of the operating system, in particular, is critical to such systems. To study OS scalability and many other issues related to OS performance on multicore systems, software and hardware profilers are indispensable tools. Hardware profilers give detailed performance information on hardware components with minimal overhead, but it is difficult to relate the information they collect to specific software events. Hence, most OS profiling tools are software based. Such profilers often incur significant overheads when more precise measurements are required. The situation is exacerbated because most of these tools have scalability issues themselves: their overheads can grow more than proportionately to the number of cores and/or the number of threads in a system. Our results show that such overheads not only cause much longer execution times (often by orders of magnitude), but also perturb program execution and produce misleading profiling results. To mitigate such problems, we propose an approach, called selective profiling, that uses a mix of profiling tools with different levels of precision and overhead to produce the desired results with tolerable overhead. In selective profiling, potential scalability bottlenecks and hotspots are first identified by low-overhead tracers. More detailed information on the selected bottlenecks and hotspots is then collected by a sampler with more precision but heavier overhead. Since the sampler focuses only on the selected bottlenecks and hotspots instead of the entire program, the overhead can be substantially reduced. Applying this approach to some OS benchmarks, we show that selective profiling can efficiently identify their scalability bottlenecks with much reduced overhead.
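The two-phase idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `top_fraction` coverage threshold, and the `detailed_sampler` callback are all hypothetical, and the per-function costs are assumed to come from some low-overhead tracer.

```python
from typing import Callable, Dict, List


def select_hotspots(trace_totals: Dict[str, float],
                    top_fraction: float = 0.8) -> List[str]:
    """Phase 1 (hypothetical): from cheap tracer totals, pick the fewest
    functions that together cover `top_fraction` of the traced cost."""
    total = sum(trace_totals.values())
    selected: List[str] = []
    covered = 0.0
    # Visit functions from most to least expensive.
    ranked = sorted(trace_totals.items(), key=lambda kv: kv[1], reverse=True)
    for name, cost in ranked:
        if covered >= top_fraction * total:
            break
        selected.append(name)
        covered += cost
    return selected


def selective_profile(trace_totals: Dict[str, float],
                      detailed_sampler: Callable[[str], object],
                      top_fraction: float = 0.8) -> Dict[str, object]:
    """Phase 2 (hypothetical): run the expensive sampler only on the
    functions chosen in phase 1, not on the whole program."""
    hotspots = select_hotspots(trace_totals, top_fraction)
    return {name: detailed_sampler(name) for name in hotspots}
```

In this sketch the costly sampler is invoked once per selected function, so its overhead scales with the number of hotspots rather than with the size of the whole workload. The selection rule (cumulative coverage of traced cost) is only one assumed criterion; the paper may use a different one.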
