Performance analysis for languages hosted on the Truffle framework

It is attractive to host new or existing language implementations on top of, or by reusing components of, existing managed language runtimes such as the Java Virtual Machine (JVM) or the Microsoft Common Language Infrastructure (CLI). One benefit is reduced software development effort: only one managed runtime needs to be optimised and maintained, rather than a separate compiler and runtime for each language implementation. For example, the Truffle framework combined with a JVM supports the execution of JavaScript, Ruby, R, and languages compiled to LLVM IR, as well as polyglot applications that combine multiple programming languages. In trying to understand the runtime performance of Sulong (the Truffle project that enables LLVM IR execution), we found a lack of tools and guidance; the situation is similar for Ruby and R benchmarks executed as Truffle-hosted languages. Further, it is non-trivial to relate performance back to the hosted-language source code, or to determine whether JVM service overheads, such as garbage collection or JIT compilation, are significant. We describe how to visually analyse the performance of Truffle-hosted languages using flame graphs, which relate execution time to sampled call stacks. We use the Linux tool perf and the JVM agent perf-map-agent, along with enhancements to the Graal JIT compiler that map sampled call stacks onto JVM-hosted guest-language source code. This paper demonstrates the ease and flexibility of these modified tools, which impose low overhead at execution time, and illustrates how the techniques apply to understanding the performance of polyglot applications.
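For concreteness, the following is a minimal sketch of the standard sampling workflow the paper builds on: Linux perf records on-CPU call stacks of a running JVM, perf-map-agent writes a symbol map for JIT-compiled code, and Brendan Gregg's FlameGraph scripts render the folded stacks as an SVG. The install paths, PID handling, and sampling parameters are illustrative assumptions; the paper's Graal-level enhancements for mapping stacks onto guest-language source are not shown here.

```python
#!/usr/bin/env python3
"""Minimal sketch: perf + perf-map-agent + FlameGraph for a JVM process.

Assumptions: perf-map-agent is installed under /opt/perf-map-agent and
brendangregg/FlameGraph is cloned under /opt/FlameGraph; adjust as needed.
"""

import subprocess
import sys

pid = sys.argv[1]                       # PID of the running JVM (e.g. a Truffle launcher)
PERF_MAP_AGENT = "/opt/perf-map-agent"  # assumed install location
FLAMEGRAPH = "/opt/FlameGraph"          # assumed clone of the FlameGraph repository

# 1. Sample on-CPU call stacks of the JVM process at 99 Hz for 30 seconds.
#    The JVM should have been started with -XX:+PreserveFramePointer so that
#    perf can walk JIT-compiled Java frames.
subprocess.run(["perf", "record", "-F", "99", "-g", "-p", pid, "sleep", "30"],
               check=True)

# 2. Attach perf-map-agent, which writes /tmp/perf-<pid>.map so that perf can
#    symbolise addresses that fall inside JIT-compiled code.
subprocess.run([f"{PERF_MAP_AGENT}/bin/create-java-perf-map.sh", pid], check=True)

# 3. Dump the samples, fold the stacks, and render an interactive SVG flame graph.
script = subprocess.run(["perf", "script"], capture_output=True, check=True)
folded = subprocess.run([f"{FLAMEGRAPH}/stackcollapse-perf.pl"],
                        input=script.stdout, capture_output=True, check=True)
with open("flamegraph.svg", "wb") as out:
    subprocess.run([f"{FLAMEGRAPH}/flamegraph.pl"],
                   input=folded.stdout, stdout=out, check=True)
```

In this workflow the map file produced in step 2 is what allows perf script to attribute samples to JIT-compiled methods rather than anonymous addresses; the paper's contribution extends this attribution from JVM-level methods down to the hosted guest-language source.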
