JCUDA: A Programmer-Friendly Interface for Accelerating Java Programs with CUDA

A recent trend in mainstream desktop systems is the use of general-purpose graphics processing units (GPGPUs) to obtain order-of-magnitude performance improvements. CUDA has emerged as a popular programming model for GPGPUs among C/C++ programmers. Given the widespread use of modern object-oriented languages with managed runtimes, such as Java and C#, it is natural to explore how CUDA-like capabilities can be made accessible to those programmers as well. In this paper, we present JCUDA, a programming interface that Java programmers can use to invoke CUDA kernels. Using this interface, programmers can write Java code that directly calls CUDA kernels and delegate to the compiler the generation of the Java-CUDA bridge code and the host-device data transfer calls. Our preliminary performance results show that this interface can deliver significant performance improvements to Java programmers. For future work, we plan to use the JCUDA interface as a target language for supporting higher-level parallel programming languages such as X10 and Habanero-Java.
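
To make the division of labor described above concrete, the following is a minimal plain-Java sketch of the kind of work a generated bridge might perform around a kernel call: copy inputs to the device, launch the kernel over a grid of blocks, and copy results back. It is not JCUDA syntax and not the code the JCUDA compiler emits. The names `BridgeSketch`, `copyToDevice`, `copyToHost`, and `saxpyKernel` are hypothetical, and the "device" is simulated with ordinary Java arrays so that the example runs without a GPU.

```java
import java.util.Arrays;

public class BridgeSketch {
    // Hypothetical stand-in for a host-to-device transfer. A real bridge
    // would presumably allocate device memory and copy through native code.
    static float[] copyToDevice(float[] host) {
        return Arrays.copyOf(host, host.length);
    }

    // Hypothetical stand-in for a device-to-host transfer.
    static void copyToHost(float[] device, float[] host) {
        System.arraycopy(device, 0, host, 0, host.length);
    }

    // Emulated kernel body: one logical thread handles one element,
    // using the usual CUDA-style global index computation.
    static void saxpyKernel(int blockIdx, int blockDim, int threadIdx,
                            int n, float a, float[] x, float[] y) {
        int i = blockIdx * blockDim + threadIdx;
        if (i < n) {
            y[i] = a * x[i] + y[i];
        }
    }

    // The call a Java programmer would like to write. Everything inside is
    // the boilerplate that the abstract says is delegated to the compiler.
    static void saxpy(float a, float[] x, float[] y) {
        int n = y.length;
        float[] dX = copyToDevice(x);   // input only: copied in
        float[] dY = copyToDevice(y);   // input and output: copied in and out
        int blockDim = 4;               // assumed block size for illustration
        int gridDim = (n + blockDim - 1) / blockDim;
        // Sequential emulation of the parallel launch.
        for (int b = 0; b < gridDim; b++) {
            for (int t = 0; t < blockDim; t++) {
                saxpyKernel(b, blockDim, t, n, a, dX, dY);
            }
        }
        copyToHost(dY, y);              // results copied back to the host
    }

    public static void main(String[] args) {
        float[] x = {1f, 2f, 3f, 4f, 5f};
        float[] y = {10f, 20f, 30f, 40f, 50f};
        saxpy(2f, x, y);
        System.out.println(Arrays.toString(y)); // [12.0, 24.0, 36.0, 48.0, 60.0]
    }
}
```

In this sketch the caller sees only `saxpy(a, x, y)`; the transfers and the launch loop are the parts a programmer would otherwise write by hand.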
