A Composable Array Function Interface for Heterogeneous Computing in Java

Heterogeneous computing has become mainstream, with virtually every desktop machine featuring accelerators such as Graphics Processing Units (GPUs). While heterogeneity offers the promise of high performance and high efficiency, it comes at the cost of considerable programming difficulty. Languages and interfaces for programming such systems tend to be low-level and require expert knowledge of the hardware to exploit its potential. A promising approach for programming heterogeneous systems is array programming. This style of programming relies on well-known parallel patterns that can be easily translated into GPU or other accelerator code. However, little work has been done on integrating such concepts into mainstream languages such as Java. In this work, we propose a new Array Function interface implemented with the new features of Java 8. While similar in spirit to the new Stream API of Java, our API follows a different design based on reusability and composability. We demonstrate that this API can be used to generate OpenCL code for a simple application. We present encouraging preliminary performance results showing the potential of our approach.
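To make the notions of reusability and composability concrete, the following is a minimal sketch of what a composable array function could look like using Java 8 lambdas and default methods. It is not the API proposed in this work: the names `DoubleArrayFunction`, `map`, `reduce`, `andThen` and `ArrayFunctionSketch` are hypothetical, the sketch is restricted to `double[]` for simplicity, and it runs sequentially on the CPU without any OpenCL code generation.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

// Hypothetical interface (an assumption for illustration, not the paper's API):
// a reusable function from an input array to an output array.
interface DoubleArrayFunction {
    double[] apply(double[] input);

    // Composition: build a new function that runs this one, then `next`.
    default DoubleArrayFunction andThen(DoubleArrayFunction next) {
        return input -> next.apply(this.apply(input));
    }

    // Map pattern: apply `f` to every element independently.
    static DoubleArrayFunction map(DoubleUnaryOperator f) {
        return input -> {
            double[] out = new double[input.length];
            for (int i = 0; i < input.length; i++) {
                out[i] = f.applyAsDouble(input[i]);
            }
            return out;
        };
    }

    // Reduce pattern: fold all elements into a single-element array.
    static DoubleArrayFunction reduce(double identity, DoubleBinaryOperator op) {
        return input -> {
            double acc = identity;
            for (double x : input) {
                acc = op.applyAsDouble(acc, x);
            }
            return new double[] { acc };
        };
    }
}

final class ArrayFunctionSketch {
    // The composed function is an ordinary value: it is defined once,
    // independently of any data, and can be applied to many arrays.
    static final DoubleArrayFunction SUM_OF_SQUARES =
        DoubleArrayFunction.map(x -> x * x)
            .andThen(DoubleArrayFunction.reduce(0.0, Double::sum));

    public static void main(String[] args) {
        double[] result = SUM_OF_SQUARES.apply(new double[] { 1.0, 2.0, 3.0 });
        System.out.println(result[0]); // prints 14.0
    }
}
```

In this sketch the function is constructed separately from the data it processes, whereas a Java 8 `Stream` pipeline is tied to its source and can be consumed only once. Keeping the computation as a standalone object is one way such a design could allow it to be reused or handed to a code generator.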
