Automatic Restructuring of Linked Data Structures

The memory subsystem is one of the major performance bottlenecks in modern computer systems. Much effort has gone into optimizing codes that access data regularly, but not all codes do so. Programs that use pointer-linked data structures are notorious for producing so-called irregular memory access patterns. In this paper, we present a compilation and run-time framework that enables fully automatic restructuring of pointer-linked data structures for type-unsafe languages, such as C. The framework restructures data at run time, guided by run-time trace information. The compiler transformation chain first identifies disjoint data structures that are stored in type-homogeneous memory pools. Accesses to these pools are traced, and a permutation vector is derived from the resulting traces. The memory pool is restructured at run time using this permutation, after which all pointers (both stack and heap) that refer to the restructured pool must be updated. Although run-time tracing incurs considerable overhead, we show that restructuring pointer-linked data structures can yield substantial speedups and that, in general, the incurred overhead is compensated for by the performance improvements.
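The trace, permute, and pointer-update steps described above can be illustrated with a minimal sketch in C. It is not the paper's implementation: the `Node` type, the function names, and the first-touch ordering used to derive the permutation are assumptions made for illustration, and the caller hands over the external (stack) pointers explicitly, whereas the framework described here locates them automatically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type; all nodes live in one type-homogeneous pool. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Derive perm[old_index] = new_index from a trace of accessed pool indices.
 * Assumption: slots are ordered by first touch; slots that never appear in
 * the trace are appended afterwards in their original order. */
static void derive_permutation(const size_t *trace, size_t trace_len,
                               size_t n, size_t *perm)
{
    size_t next_slot = 0;
    for (size_t i = 0; i < n; i++)
        perm[i] = SIZE_MAX;                     /* mark as unassigned */
    for (size_t t = 0; t < trace_len; t++) {
        size_t old = trace[t];
        if (old < n && perm[old] == SIZE_MAX)
            perm[old] = next_slot++;
    }
    for (size_t i = 0; i < n; i++)
        if (perm[i] == SIZE_MAX)
            perm[i] = next_slot++;
}

/* Translate a pointer into the pool to its post-restructuring address.
 * NULL and pointers outside the pool are returned unchanged. */
static Node *remap(Node *p, Node *pool, size_t n, const size_t *perm)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)pool;
    uintptr_t hi = (uintptr_t)(pool + n);
    if (p == NULL || a < lo || a >= hi)
        return p;
    return pool + perm[(size_t)(p - pool)];
}

/* Move every node to its new slot, rewrite the heap pointers stored inside
 * the pool, then rewrite the external pointers listed in roots.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static int restructure_pool(Node *pool, size_t n, const size_t *perm,
                            Node **roots[], size_t n_roots)
{
    if (n == 0)
        return 0;
    Node *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (size_t old = 0; old < n; old++) {
        assert(perm[old] < n);                  /* perm must be a valid permutation */
        tmp[perm[old]] = pool[old];
        tmp[perm[old]].next = remap(pool[old].next, pool, n, perm);
    }
    memcpy(pool, tmp, n * sizeof *pool);
    free(tmp);
    for (size_t r = 0; r < n_roots; r++)
        *roots[r] = remap(*roots[r], pool, n, perm);
    return 0;
}
```

With a trace recorded during one traversal of a linked list, the nodes end up stored in traversal order, so a later traversal walks the pool sequentially. The sketch relies on the pool being type-homogeneous: every slot has the same layout, so each embedded pointer can be found and rewritten at a known offset.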
