Java server performance: A case study of building efficient, scalable JVMs

The focus of the Java™ platform has shifted from the client to the server. In particular, the Java language has matured into a viable programming model for server applications. Correspondingly, the requirements on the Java virtual machine (JVM) have shifted. This paper details the server-specific performance enhancements made to the core JVM and just-in-time (JIT) compiler, which have allowed the IBM Developer Kits that implement Java code for Intel processors to become industry performance leaders. The paper focuses on synchronization implementation and granularity improvements that have greatly increased the scalability of the Java language on multiprocessor machines. It also covers memory management, specifically object allocation, garbage collection, and heap management, and provides details of communication and connection scaling. Finally, it discusses server-specific enhancements to the JIT compiler. All component enhancements in the paper are explained, and their performance implications are quantified with results from representative multithreaded server workloads. The paper summarizes work from across IBM. The authors' specific contributions include the three-tier spin lock, the thread local heap and freelist merge, the dynamic heap growth algorithm, bitwise sweep, compaction avoidance, and the suite of network enhancements.
