An evaluation of Java's I/O capabilities for high-performance computing

Java is quickly becoming the preferred language for writing distributed applications because of its inherent support for programming on distributed platforms. In particular, Java provides compile-time and run-time security, automatic garbage collection, inherent support for multithreading, support for persistent objects and object migration, and portability. Given these significant advantages of Java, there is a growing interest in using Java for high-performance computing applications. To be successful in the high-performance computing domain, however, Java must have the capability to efficiently handle the significant I/O requirements commonly found in high-performance computing applications. While there has been significant research in high-performance I/O using languages such as C, C++, and Fortran, there has been relatively little research into the I/O capabilities of Java. In this paper, we evaluate the I/O capabilities of Java for high-performance computing. We examine several approaches that attempt to provide high-performance I/O--many of which are not obvious at first glance--and investigate their performance in both parallel and multithreaded environments. We also provide suggestions for expanding the I/O capabilities of Java to better support the needs of high-performance computing applications.

[1]  Alok N. Choudhary,et al.  Improved parallel I/O via a two-phase run-time access strategy , 1993, CARN.

[2]  Alok N. Choudhary,et al.  Design and evaluation of primitives for parallel I/O , 1993, Supercomputing '93. Proceedings.

[3]  David Kotz,et al.  Dynamic file-access characteristics of a production parallel scientific workload , 1994, Proceedings of Supercomputing '94.

[4]  Alok N. Choudhary,et al.  High-performance I/O for massively parallel computers: problems and prospects , 1994, Computer.

[5]  David Kotz,et al.  Disk-directed I/O for MIMD multiprocessors , 1994, OSDI '94.

[6]  Anthony Skjellum,et al.  Using MPI - portable parallel programming with the message-parsing interface , 1994 .

[7]  M. Winslett,et al.  Server-Directed Collective I/O in Panda , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[8]  D.A. Reed,et al.  Input/Output Characteristics of Scalable Parallel Applications , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[9]  Dror G. Feitelson,et al.  Parallel I/O subsystems in massively parallel supercomputers , 1995, IEEE Parallel & Distributed Technology: Systems & Applications.

[10]  Rajeev Thakur,et al.  An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays , 1996, Sci. Program..

[11]  Rajeev Thakur,et al.  Passion: Optimized I/O for Parallel Applications , 1996, Computer.

[12]  Phillip M. Dickens A Performance Study of Two-Phase I/O , 1998, Euro-Par.

[13]  Rajeev Thakur,et al.  I/O in Parallel Applications: the Weakest Link , 1998, Int. J. High Perform. Comput. Appl..

[14]  Marc Najork,et al.  Performance limitations of the Java core libraries , 1999, JAVA '99.

[15]  Geoffrey C. Fox,et al.  Object serialization for marshalling data in a Java interface to MPI , 1999, JAVA '99.

[16]  Anthony Skjellum,et al.  Using MPI: portable parallel programming with the message-passing interface, 2nd Edition , 1999, Scientific and engineering computation series.

[17]  Tigris: A Java-based Cluster I/O System , 1999 .

[18]  George K. Thiruvathukal,et al.  Java on networks of workstations (JavaNOW): a parallel computing framework inspired by Linda and the Message Passing Interface (MPI) , 2000 .

[19]  David E. Culler,et al.  Jaguar: enabling efficient communication and I/O in Java , 2000 .