Guided linking: dynamic linking without the costs

Dynamic linking is extremely common in modern software systems, thanks to the flexibility and space savings it offers. However, this flexibility comes at a cost: it’s impossible to perform interprocedural optimizations that involve calls to a dynamic library. The basic problem is that the run-time behavior of the dynamic linker can’t be predicted at compile time, so the compiler can make no assumptions about how such calls will behave. This paper introduces guided linking, a technique for optimizing dynamically linked software when some information about the dynamic linker’s behavior is known in advance. The developer provides an arbitrary set of programs, libraries, and plugins to our tool, along with constraints that limit the possible dynamic linking behavior of the software. By taking advantage of the constraints, our tool enables any existing optimization to be applied across dynamic linking boundaries. For example, the NoOverride constraint can be applied to a function when the developer knows it will never be overridden with a different definition at run time; guided linking then enables the function to be inlined into its callers in other libraries. We also introduce a novel code size optimization that deduplicates identical functions even across different parts of the software set. By applying guided linking to the Python interpreter and its dynamically loaded modules, supplying the constraint that no other programs or modules will be used, we increase speed by an average of 9%. By applying guided linking to a dynamically linked distribution of Clang and LLVM, and using the constraint that no other software will use the LLVM libraries, we can increase speed by 5% and reduce file size by 13%. If we relax the constraint to allow other software to use the LLVM libraries, we can still increase speed by 5% and reduce file size by 5%. If we use guided linking to combine 11 different versions of the Boost library, using minimal constraints, we can reduce the total library size by 57%.
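To make the deduplication idea concrete, the following is a minimal sketch and not the paper's tool. It assumes each library in the software set can be modeled as a mapping from function names to body text, and it treats two functions as identical only when their bodies match byte for byte. The names `deduplicate`, `total_size`, `libfoo_v1`, and `libfoo_v2` are hypothetical and exist only for illustration.

```python
import hashlib


def deduplicate(modules):
    """Keep one copy of each distinct function body across all modules.

    modules: {module_name: {function_name: body_text}} (a toy model, assumed here).
    Returns (shared, stubs):
      shared maps a content digest to the single retained body,
      stubs maps (module_name, function_name) to the digest it now refers to.
    """
    shared = {}
    stubs = {}
    for module_name, functions in modules.items():
        for function_name, body in functions.items():
            digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
            # Store the body only the first time this exact content is seen.
            shared.setdefault(digest, body)
            stubs[(module_name, function_name)] = digest
    return shared, stubs


def total_size(bodies):
    """Sum of body lengths, used as a stand-in for code size."""
    return sum(len(body) for body in bodies)


# Two hypothetical library versions: `clamp` is unchanged, `area` differs.
modules = {
    "libfoo_v1": {"clamp": "ret max(lo, min(x, hi))", "area": "ret w * h"},
    "libfoo_v2": {"clamp": "ret max(lo, min(x, hi))", "area": "ret w * h * scale"},
}

shared, stubs = deduplicate(modules)
before = total_size(body for fns in modules.values() for body in fns.values())
after = total_size(shared.values())
print(f"{len(stubs)} functions, {len(shared)} distinct bodies, {before} -> {after} size units")
```

In this toy model the unchanged `clamp` body is stored once and both library versions refer to it, which is the kind of saving one would expect when many versions of the same library are combined. A real implementation would have to compare compiled code rather than text and preserve each library's exported symbols, which this sketch does not attempt.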
