A performance study of data layout techniques for improving data locality in refinement-based pathfinding

The widening gap between processor speed and memory latency increases the importance of crafting data structures and algorithms that exploit temporal and spatial locality. Refinement-based pathfinding algorithms, such as Classic Refinement (CR), find quality paths in very large sparse graphs where traditional search techniques fail to generate paths in acceptable time. In this paper, we present a performance evaluation study of three simple data structure transformations aimed at improving the data reference locality of CR. These transformations are robust to changes in computer architecture and in the degree of compiler optimization. We test our alternative designs on four contemporary architectures, using two compilers on each machine. In our experiments, applying these techniques yields performance improvements of up to 67%, with improvements consistently above 15%. Analysis reveals that these gains stem from improved data reference locality at the page level and, to a lesser extent, at the cache-line level.
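The abstract does not name the three transformations, so the following is only a hypothetical illustration of the general idea of a locality-oriented layout change, not the authors' method. It assumes that nodes are stored in a flat array, that each node belongs to a cluster (a stand-in for an abstract-level grouping), and that a fixed number of node records fit on one memory page. The names `reorder_by_cluster`, `pages_touched`, and `nodes_per_page` are invented for this sketch.

```python
def reorder_by_cluster(cluster_of):
    """Return new_index[old_index] such that nodes in the same cluster
    receive contiguous indices (assumed layout change, for illustration)."""
    order = sorted(range(len(cluster_of)), key=lambda n: (cluster_of[n], n))
    new_index = [0] * len(cluster_of)
    for new, old in enumerate(order):
        new_index[old] = new
    return new_index


def pages_touched(indices, nodes_per_page):
    """Count the distinct pages referenced when visiting the given array slots."""
    return len({i // nodes_per_page for i in indices})


# Toy graph: 16 nodes whose 4 clusters are interleaved in the original layout.
cluster_of = [n % 4 for n in range(16)]
new_index = reorder_by_cluster(cluster_of)

# Visit every node of one cluster, as a refinement step confined to it might.
members = [n for n in range(16) if cluster_of[n] == 2]
before = pages_touched(members, nodes_per_page=4)
after = pages_touched([new_index[n] for n in members], nodes_per_page=4)
print(before, after)
```

In this toy setup the nodes of a single cluster start out spread over every page and end up on one page after renumbering, which is the kind of page-level effect the abstract attributes most of the speedup to. Real gains depend on record size, page size, and the access pattern of the search.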