Fast PGAS connected components algorithms

Irregular graph algorithms for distributed-memory systems are hard to implement and optimize. Recent developments in PGAS languages make such algorithms easier to implement. In this paper we present a study of PRAM-based parallel connected components algorithms implemented in UPC for distributed-memory systems, and we discuss optimization techniques for this setting. Our optimized version achieves a speedup of more than 100 times over the straightforward implementation. It also achieves substantial speedups over the best SMP implementation on the same input. Because the memory access patterns of these algorithms are representative of those of many other PRAM algorithms, we expect our techniques to be applicable to optimizing a wide range of PRAM graph algorithms on distributed-memory machines.
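To make the class of algorithm concrete, the following is a minimal sequential sketch of a graft-and-shortcut connected components procedure in the general style of PRAM connectivity algorithms. It is not the UPC implementation studied in the paper: the function name `connected_components`, the `parent` array, and the edge-list input format are assumptions made for illustration, and the loops that a PRAM or PGAS version would run in parallel are executed serially here.

```python
def connected_components(n, edges):
    """Return a list where entry v is the smallest vertex id in v's component.

    Illustrative sequential simulation (assumed names and input format):
    n is the number of vertices, edges is a list of (u, v) pairs.
    """
    parent = list(range(n))  # every vertex starts as its own root
    changed = True
    while changed:
        changed = False
        # Graft step: for each edge joining two different trees, hook the
        # larger-labelled root under the smaller label.
        for u, v in edges:
            pu, pv = parent[u], parent[v]
            if pu == pv:
                continue
            hi, lo = max(pu, pv), min(pu, pv)
            if parent[hi] == hi:  # only a root may be grafted
                parent[hi] = lo
                changed = True
        # Shortcut step (pointer jumping): make every vertex point
        # directly at the root of its tree.
        for v in range(n):
            while parent[v] != parent[parent[v]]:
                parent[v] = parent[parent[v]]
    return parent


if __name__ == "__main__":
    print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))
```

The indirect reads such as `parent[parent[v]]` are data-dependent, which illustrates why this kind of algorithm produces irregular memory accesses when `parent` is spread across the nodes of a distributed-memory machine.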
