RANDOMIZED OPTIMAL LIST RANKING ON COARSE-GRAINED PARALLEL COMPUTERS WITH O(log p) COMMUNICATION PHASES

Interprocessor communication has a substantial impact on execution time when parallel algorithms are implemented on physical parallel computers. On coarse-grained parallel computers it is important to minimize the number of communication phases, so that the cost of communication is balanced against local computation time. We present an optimal randomized parallel list ranking algorithm that uses O(log p) communication phases, where p is the number of processors, and show its application to expression evaluation. These techniques address the general issue of model-independent parallel algorithm design.
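To make the list ranking problem concrete, the following is a minimal sequential sketch of randomized list contraction by coin flipping, a standard technique for this problem. It is an illustration under stated assumptions, not the algorithm of this paper: the function name `list_rank`, the `succ` array encoding (with `-1` marking the tail), and the `seed` parameter are all hypothetical, and the sketch makes no attempt to match the O(log p) phase bound. Each round of the loop stands in for one communication phase of a parallel execution.

```python
import random


def list_rank(succ, seed=0):
    """Return rank[i] = number of links from node i to the tail of its list.

    succ[i] is the successor of node i, or -1 if i is a tail.
    Sequential simulation of randomized contraction (illustrative only).
    """
    rng = random.Random(seed)
    n = len(succ)
    nxt = list(succ)                           # working copy of the links
    w = [0 if s == -1 else 1 for s in nxt]     # w[i] = distance from i to nxt[i]
    live = set(range(n))
    removed = []                               # stack of (node, its successor, its weight)

    # Contraction: repeat until every remaining node has no successor.
    while any(nxt[i] != -1 for i in live):
        heads = {i: rng.random() < 0.5 for i in live}
        for i in list(live):
            j = nxt[i]
            # A "heads" node splices out a "tails" successor. Two adjacent
            # nodes are never removed in the same round, so splices are independent.
            if j != -1 and heads[i] and not heads[j]:
                removed.append((j, nxt[j], w[j]))
                w[i] += w[j]
                nxt[i] = nxt[j]
                live.discard(j)

    # Expansion: restore removed nodes in reverse order of removal.
    rank = [0] * n
    for i in live:
        rank[i] = w[i]
    for j, s, wj in reversed(removed):
        rank[j] = wj + (rank[s] if s != -1 else 0)
    return rank


if __name__ == "__main__":
    # The list 3 -> 0 -> 2 -> 1 -> 4, stored as a successor array.
    print(list_rank([2, 4, 1, 0, -1]))
```

In expectation a constant fraction of the remaining nodes is spliced out per round, which is why contraction of this kind needs only logarithmically many rounds in the list length with high probability.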
