Optimal Parallel Merging and Sorting Without Memory Conflicts

A parallel algorithm is described for merging two sorted vectors of total length N. The algorithm runs on a shared-memory model of parallel computation in which no two processors may simultaneously read from or write into the same memory location. It uses k processors, where 1 ≤ k ≤ N, and requires O(N/k + log k × log N) time. The proposed approach for merging leads to a parallel sorting algorithm that sorts a vector of length N in O((log² k + N/k) log N) time. Because they modify their behavior, and hence their running time, according to the number of available processors, the two new algorithms are said to be self-reconfiguring. In addition, both algorithms are optimal for k ≤ N/log² N, in view of the Ω(N) and Ω(N log N) lower bounds on merging and sorting, respectively.
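
To make the idea of merging with k processors concrete, the following is a minimal sketch, not the algorithm of this paper. It assumes a generic partition-then-merge scheme: a binary search (the hypothetical helper `co_rank`) locates k - 1 split points so that each of the k simulated processors merges an independent pair of subsequences of total length about N/k and writes to its own disjoint slice of the output. The names `co_rank`, `merge_piece`, and `parallel_merge` are illustrative. The processors are simulated by a sequential loop, and the sketch does not model the conflict-free reads that the shared-memory model above requires during the search phase.

```python
def co_rank(i, a, b):
    """Return j such that the first i merged elements are a[:j] and b[:i-j].

    Ties are resolved in favour of a, so the merge is stable.
    """
    lo = max(0, i - len(b))
    hi = min(i, len(a))
    while lo < hi:
        j = (lo + hi) // 2
        if a[j] <= b[i - j - 1]:
            lo = j + 1  # a[j] precedes b[i-j-1], so take more from a
        else:
            hi = j
    return lo


def merge_piece(a, a_lo, a_hi, b, b_lo, b_hi, out, pos):
    """Sequentially merge a[a_lo:a_hi] and b[b_lo:b_hi] into out starting at pos."""
    i, j = a_lo, b_lo
    while i < a_hi and j < b_hi:
        if a[i] <= b[j]:
            out[pos] = a[i]
            i += 1
        else:
            out[pos] = b[j]
            j += 1
        pos += 1
    while i < a_hi:
        out[pos] = a[i]
        i += 1
        pos += 1
    while j < b_hi:
        out[pos] = b[j]
        j += 1
        pos += 1


def parallel_merge(a, b, k):
    """Merge sorted lists a and b using k simulated processors (k >= 1)."""
    n = len(a) + len(b)
    out = [None] * n
    # Output boundaries: processor p produces out[bounds[p]:bounds[p + 1]].
    bounds = [p * n // k for p in range(k + 1)]
    a_cuts = [co_rank(i, a, b) for i in bounds]
    # Each iteration touches a disjoint output slice, so the k pieces
    # could run concurrently without write conflicts.
    for p in range(k):
        a_lo, a_hi = a_cuts[p], a_cuts[p + 1]
        b_lo, b_hi = bounds[p] - a_lo, bounds[p + 1] - a_hi
        merge_piece(a, a_lo, a_hi, b, b_lo, b_hi, out, bounds[p])
    return out


print(parallel_merge([1, 3, 5, 7], [2, 4, 6, 8, 10], 3))
# [1, 2, 3, 4, 5, 6, 7, 8, 10]
```

In this sketch each processor does O(N/k) merging work after an O(log N) search. On the optimality claim, taking cost to mean processors × time, the merging bound gives k × O(N/k + log k × log N) = O(N + k log k × log N), which is O(N) when k ≤ N/log² N.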