Practical extraction techniques for Java

Reducing application size matters in two settings: software distributed over the internet, where it keeps download times manageable, and embedded systems, where applications are often stored in read-only or flash memory. This paper explores extraction techniques for reducing application size, including the removal of unreachable methods and redundant fields, the inlining of method calls, and the transformation of the class hierarchy. We implemented a number of these techniques in Jax, an application extractor for Java, and evaluated their effectiveness on a set of large Java applications. On average, the class file archives for these benchmarks were reduced to 37.5% of their original size. Modeling dynamic language features such as reflection and extracting software distributions other than complete applications both require additional user input. We present a uniform approach for supplying this input that relies on MEL, a modular specification language. We also discuss a number of issues and challenges associated with extracting embedded systems applications.
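
To make the first technique concrete, the following is a minimal sketch of unreachable-method removal, not a description of how Jax is implemented. It assumes a call graph is already available as a map from method names to callees. The class name ReachabilitySketch, the methods reachable and removable, and all method names in the example graph are hypothetical. A real extractor works on class files and must build the call graph itself, accounting for virtual dispatch and for reflection; none of that is modeled here.

```java
import java.util.*;

public class ReachabilitySketch {
    // Collects every method reachable from the entry points by a
    // worklist traversal of the (assumed, precomputed) call graph.
    static Set<String> reachable(Map<String, List<String>> callGraph,
                                 Collection<String> roots) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String method = work.pop();
            if (seen.add(method)) {
                for (String callee : callGraph.getOrDefault(method, List.of())) {
                    work.push(callee);
                }
            }
        }
        return seen;
    }

    // Methods that appear in the graph but are never reached from a root
    // are candidates for removal.
    static Set<String> removable(Map<String, List<String>> callGraph,
                                 Collection<String> roots) {
        Set<String> all = new TreeSet<>(callGraph.keySet());
        callGraph.values().forEach(all::addAll);
        all.removeAll(reachable(callGraph, roots));
        return all;
    }

    public static void main(String[] args) {
        // Hypothetical application: only App.main is an entry point.
        Map<String, List<String>> callGraph = Map.of(
            "App.main", List.of("Parser.parse", "Log.info"),
            "Parser.parse", List.of("Log.info"),
            "Log.info", List.of(),
            "Parser.debugDump", List.of("Log.info"),
            "Legacy.convert", List.of("Legacy.helper"),
            "Legacy.helper", List.of());
        // prints [Legacy.convert, Legacy.helper, Parser.debugDump]
        System.out.println(removable(callGraph, List.of("App.main")));
    }
}
```

In this sketch the choice of roots stands in for the additional user input the abstract mentions: a method invoked only through reflection, or an entry point of a library rather than a complete application, would have to be listed as a root to survive extraction.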
