Scalable cross-module optimization

Large applications are typically partitioned into separately compiled modules, and optimizing across module boundaries can yield large performance gains. One barrier to applying cross-module optimization (CMO) to such applications is the potentially enormous amount of time and space that the optimization process consumes. We describe a framework for scalable CMO that delivers large performance gains on applications containing millions of lines of code. The framework relies on two major techniques. First, careful management of in-memory data structures keeps memory occupancy sub-linear in the number of lines of code being optimized. Second, profile data is used to focus optimization effort on the performance-critical portions of an application. We also discuss practical issues that arise when deploying this framework in a production environment, including debuggability and compatibility with existing development tools such as make. The framework is deployed in Hewlett-Packard's (HP) UNIX compiler products and speeds up shipped applications from independent software vendors by as much as 71%.
