Vectorization of apply to reduce interpretation overhead of R

R is a popular dynamic language designed for statistical computing. Despite R's huge user base, the inefficiency of its language implementation is a major pain point in everyday use and an obstacle to applying R to large-scale analytics problems. The two most common approaches to improving the performance of dynamic languages are implementing more efficient interpretation strategies and extending the interpreter with a Just-In-Time (JIT) compiler. However, both approaches require significant changes to the interpreter, which complicates adoption by development teams. This paper presents a new approach to improving the execution efficiency of R programs by vectorizing the widely used Apply class of operations. Apply accepts two parameters: a function and a collection of input data elements. The standard implementation of Apply iteratively invokes the input function on each element in the data collection. Our approach combines data transformation and function vectorization to convert the looping-over-data execution of the standard Apply into a single invocation of a vectorized function that contains a sequence of vector operations over the input data. This conversion can significantly speed up the execution of Apply operations in R by reducing the number of interpretation steps. We implemented the vectorization transformation as an R package. To enable the optimization, the user only needs to invoke the package and can continue to use an unmodified R interpreter. The evaluation shows that the proposed method delivers significant performance improvements for a collection of data analysis algorithm benchmarks. This is achieved without any native code generation and using only a single thread of execution.
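
The difference between the looping execution of a standard Apply and a single vectorized invocation can be illustrated with a small hand-written sketch in base R. The names `f`, `vec_f`, and `data` are hypothetical, and the vectorized form is written by hand for this simple case; it is not the package's actual transformation, whose details the abstract does not give.

```r
# Hypothetical scalar function passed to Apply.
f <- function(x) x * 2 + 1

# Input collection: a list of five scalar elements.
data <- as.list(1:5)

# Standard Apply: the interpreter invokes f once per element (5 calls here).
looped <- lapply(data, f)

# Hand-vectorized equivalent (an assumption for illustration):
# 1. data transformation: flatten the list into a single vector;
# 2. function vectorization: the body uses only elementwise arithmetic,
#    so the same expression works on the whole vector in one call.
vec_f <- function(xs) xs * 2 + 1
vectorized <- as.list(vec_f(unlist(data)))

# Both forms produce the same list of results.
same <- identical(looped, vectorized)
```

In this sketch the interpreter evaluates the function body once over a length-5 vector instead of five times over scalars, which is the source of the reduction in interpretation steps described above. Functions with control flow or non-elementwise operations would need a more involved transformation than this example shows.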
