Waste makes haste: tight bounds for loose parallel sorting

Conventional parallel sorting requires the n input keys to be output in an array of size n, and is known to take Ω(log n / log log n) time using any polynomial number of processors. The lower bound does not apply to the more 'wasteful' convention of padded sorting, which requires the keys to be output in sorted order in an array of size (1 + o(1))n. The authors give very fast randomised CRCW PRAM algorithms for several padded-sorting problems. Applying only pairwise comparisons to the input and using kn processors, where 2 ≤ k ≤ n, they can padded-sort n keys in O(log n / log k) time with high probability (WHP), which is the best possible (expected) run time for any comparison-based algorithm. They also show how to padded-sort n independent random numbers in O(log* n) time WHP with O(n) work, which matches a recent lower bound, and how to padded-sort n integers in the range 1..n in constant time WHP using n processors. If the integer sorting is required to be stable, they can still solve the problem in O(log log n / log k) time WHP using kn processors, for any k with 2 ≤ k ≤ log n. The integer sorting results require the nonstandard OR PRAM. As an application of the padded-sorting algorithms, they can solve approximate prefix summation problems of size n with O(n) work in constant time WHP on the OR PRAM, and in O(log log n) time WHP on standard PRAM variants.
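To make the padded-sorting output convention concrete, here is a minimal sequential sketch in Python. It is not the authors' CRCW PRAM algorithm; it only illustrates what a padded-sorted output looks like for keys assumed to be uniform in [0, 1). The function name `padded_sort`, the bucketing approach, and the parameter choices (expected bucket size about log² n, slack of 3·sqrt(mean · ln n)) are assumptions made for illustration. The padding fraction shrinks as n grows but is large for small n.

```python
import math


def padded_sort(keys):
    """Place keys from [0, 1) in sorted order into a slightly oversized array.

    Returns a list whose non-None entries are the keys in nondecreasing
    order, or None if some bucket overflows its segment (the caller may
    then fall back to an ordinary sort).
    """
    n = len(keys)
    if n < 2:
        return list(keys)

    # Illustrative parameters (assumptions, not taken from the paper):
    # expected bucket size ~ log^2 n, plus slack that is o(bucket size).
    expected = max(1, math.ceil(math.log2(n) ** 2))
    num_buckets = max(1, n // expected)
    mean = n / num_buckets
    capacity = math.ceil(mean + 3 * math.sqrt(mean * math.log(n)))

    # Distribute keys by value; the bucket index is monotone in the key.
    buckets = [[] for _ in range(num_buckets)]
    for x in keys:
        buckets[min(num_buckets - 1, int(x * num_buckets))].append(x)

    # Each bucket owns a fixed segment of the output; unused slots stay None.
    out = [None] * (num_buckets * capacity)
    for j, bucket in enumerate(buckets):
        if len(bucket) > capacity:
            return None  # overflow: unlikely for uniform random input
        bucket.sort()
        start = j * capacity
        out[start:start + len(bucket)] = bucket
    return out
```

Reading the result left to right and skipping the `None` slots yields the keys in sorted order. The empty slots are the 'waste' that the padded convention permits, and a `None` return signals that the fixed segment sizes were too small for the given input.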
