Parallelizing calling context profiling in virtual machines on multicores

The Calling Context Tree (CCT) is a prevailing datastructure for calling context profiling. As generating a complete CCT reflecting every method call is expensive, recent research has focused on efficiently approximating the CCT with sampling techniques. However, for tasks such as debugging, testing, and reverse engineering, complete and accurate CCTs are often needed. In this paper, we introduce the ParCCT, a novel approach to parallelizing application code and CCT generation on multicores. Each thread maintains a shadow stack and generates "packets" of method calls and returns that correspond to partial CCTs. Each packet includes a copy of the shadow stack, indicating the calling context of the first method call in the packet. Hence, packets are independent of each other and can be processed out-of-order and in parallel in order to update the CCT. Our portable and extensible implementation targets standard Java Virtual Machines, thanks to instrumentation techniques that ensure complete bytecode coverage and efficiently support custom calling context representations. The ParCCT is more than 110% faster than a primitive, non-parallel approach to CCT construction, when more than two cores are available. This speedup stems both from reduced contention and from parallelization.

[1]  Jong-Deok Choi,et al.  Accurate, efficient, and adaptive calling context profiling , 2006, PLDI '06.

[2]  William G. Griswold,et al.  An Overview of AspectJ , 2001, ECOOP.

[3]  Michael D. Bond,et al.  Probabilistic calling context , 2007, OOPSLA.

[4]  Amer Diwan,et al.  The DaCapo benchmarks: java benchmarking development and analysis , 2006, OOPSLA '06.

[5]  James R. Larus,et al.  Exploiting hardware performance counters with flow and context sensitive profiling , 1997, PLDI '97.

[6]  Walter Binder,et al.  Platform-independent profiling in a virtual execution environment , 2009 .

[7]  Stephen J. Fink,et al.  The Jalapeño virtual machine , 2000, IBM Syst. J..

[8]  David Detlefs,et al.  Garbage-first garbage collection , 2004, ISMM '04.

[9]  Walter Binder,et al.  Flexible calling context reification for aspect-oriented programming , 2009, AOSD '09.

[10]  Jeremy Manson,et al.  The Java memory model , 2005, POPL '05.

[11]  Michael D. Ernst,et al.  ReCrash: Making Software Failures Reproducible by Preserving Object States , 2008, ECOOP.

[12]  Guy L. Steele,et al.  Java(TM) Language Specification, The (3rd Edition) (Java (Addison-Wesley)) , 2005 .

[13]  Walter Binder,et al.  Aspect weaving in standard Java class libraries , 2008, PPPJ '08.

[14]  J. Michael Spivey,et al.  Fast, accurate call graph profiling , 2004, Softw. Pract. Exp..

[15]  Guy L. Steele,et al.  The Java Language Specification , 1996 .

[16]  Walter Binder,et al.  Advanced Java bytecode instrumentation , 2007, PPPJ.

[17]  Walter Binder,et al.  CCCP: complete calling context profiling in virtual execution environments , 2009, PEPM '09.

[18]  Dirk Grunwald,et al.  Shadow Profiling: Hiding Instrumentation Costs with Parallelism , 2007, International Symposium on Code Generation and Optimization (CGO'07).

[19]  John Whaley,et al.  A portable sampling-based profiler for Java virtual machines , 2000, JAVA '00.

[20]  Matthew Arnold,et al.  A framework for reducing the cost of instrumented code , 2001, PLDI '01.

[21]  Simon L. Peyton Jones,et al.  Parallel generational-copying garbage collection with a block-structured heap , 2008, ISMM '08.

[22]  Cristina V. Lopes,et al.  Aspect-oriented programming , 1999, ECOOP Workshops.

[23]  Harish Patil,et al.  Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.

[24]  Cristina V. Lopes,et al.  Aspect-oriented programming , 1999, ECOOP Workshops.