A Novel In-Place Sorting Algorithm with O(n log z) Comparisons and O(n log z) Moves

In-place sorting algorithms play an important role in many fields, such as very large database systems, data warehouses, and data mining. Such algorithms maximize the amount of data that can be processed in main memory without input/output operations. In this paper, a novel in-place sorting algorithm is presented. The algorithm comprises two phases. The first phase rearranges the unsorted input array in place, producing segments that are ordered relative to each other but whose elements are not yet sorted; it requires linear time. The second phase sorts the elements of each segment in place in O(z log z) time, where z is the size of the segment, using O(1) auxiliary storage. In the worst case, for an array of size n, the algorithm performs O(n log z) element comparisons and O(n log z) element moves. Further, no auxiliary arithmetic operations with indices are required. Besides these theoretical results, the algorithm is of practical interest because of its simplicity. Experimental results also show that it outperforms other in-place sorting algorithms. Finally, an analysis of the time and space complexity, the required number of moves, and the auxiliary storage requirements of the proposed algorithm is presented.

Keywords: auxiliary storage sorting, in-place sorting, sorting.
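The O(n log z) bound follows from the two phase costs stated in the abstract. The short derivation below is a sketch under simplifying assumptions that the abstract does not state: all segments have the same size z, z divides n, z is at least 2, and c_1 and c_2 are hypothetical constants standing in for the per-phase costs.

```latex
% Assumptions (not from the source): equal-size segments, z | n, z >= 2,
% and illustrative constants c_1, c_2.
\begin{align*}
T_{\text{phase 1}}(n) &\le c_1\, n
  && \text{(linear-time rearrangement)} \\
T_{\text{phase 2}}(n) &\le \frac{n}{z} \cdot c_2\, z \log z = c_2\, n \log z
  && \text{($n/z$ segments, each sorted in $O(z \log z)$)} \\
T(n) &\le c_1\, n + c_2\, n \log z = O(n \log z)
  && \text{(since $\log z \ge 1$ for $z \ge 2$)}
\end{align*}
```

Under the same assumptions, the argument applies equally to element comparisons and to element moves, since the abstract gives both the same per-segment bound.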
