Performance engineering case study: heap construction

The behaviour of three methods for constructing a binary heap is studied. The methods considered are the original one proposed by Williams [1964], in which elements are repeatedly inserted into a single heap; the improvement by Floyd [1964], in which small heaps are repeatedly merged into bigger heaps; and a more recent method, proposed by, e.g., Fadel et al. [1999], in which a heap is built layerwise. Both the worst-case number of instructions and the worst-case number of cache misses are analysed. It is well known that Floyd's method has the best instruction count. Let N denote the size of the heap to be constructed and B the number of elements that fit into a cache line, and let c and d be some positive constants. Our analysis shows that, under reasonable assumptions, repeated insertion and layerwise construction both incur at most cN/B cache misses, whereas repeated merging, as programmed by Floyd, can incur more than (dN log_2 B)/B cache misses. However, for a memory-tuned version of repeated merging, the number of cache misses incurred is close to the optimal bound N/B.
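The three construction strategies named in the abstract can be sketched as follows. This is a minimal illustration, not the programs analysed in the paper: it assumes a max-heap stored in a 0-indexed array, all function names are hypothetical, and the layerwise variant uses a plain quickselect as a stand-in for whatever selection routine a real implementation would use. None of the three is memory-tuned, and the sketch does not model cache misses.

```python
def sift_up(a, i):
    """Move a[i] towards the root until its parent is at least as large."""
    while i > 0:
        parent = (i - 1) // 2
        if a[parent] >= a[i]:
            break
        a[parent], a[i] = a[i], a[parent]
        i = parent


def sift_down(a, i, n):
    """Move a[i] towards the leaves of a[0:n] until no child is larger."""
    while True:
        child = 2 * i + 1
        if child >= n:
            break
        if child + 1 < n and a[child + 1] > a[child]:
            child += 1  # pick the larger of the two children
        if a[i] >= a[child]:
            break
        a[i], a[child] = a[child], a[i]
        i = child


def build_by_insertion(a):
    """Repeated insertion: extend the single heap a[0:i] by a[i]."""
    for i in range(1, len(a)):
        sift_up(a, i)


def build_by_merging(a):
    """Repeated merging: join the two sub-heaps below each internal node,
    visiting the internal nodes from the last one up to the root."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)


def select_largest_first(a, n, k):
    """Rearrange a[0:n] so that a[0:k] holds its k largest elements.
    Quickselect in the style of Hoare's find; expected linear time only."""
    lo, hi, p = 0, n - 1, k - 1
    while lo < hi:
        pivot = a[p]
        i, j = lo, hi
        while i <= j:
            while a[i] > pivot:
                i += 1
            while a[j] < pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if j < p:
            lo = i
        if p < i:
            hi = j


def build_layerwise(a):
    """Layerwise: the n // 2 largest elements go to the front, the rest form
    the bottom layer; repeat on the front part until one element is left."""
    n = len(a)
    while n > 1:
        k = n // 2
        select_largest_first(a, n, k)
        n = k


def is_max_heap(a):
    """Check that every element is no larger than its parent."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


# Small demonstration: all three strategies yield a valid max-heap.
for build in (build_by_insertion, build_by_merging, build_layerwise):
    data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]
    build(data)
    assert is_max_heap(data)
```

The three functions generally produce different arrays for the same input; each is only required to satisfy the heap property. The layerwise sketch works because every parent index of a layer lies in the prefix that was selected to dominate that layer.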

[1] C. A. R. Hoare, et al. Algorithm 64: Quicksort, 1961, Commun. ACM.

[2] C. A. R. Hoare, et al. Algorithm 65: find, 1961, Commun. ACM.

[3] R. W. Floyd. Algorithm 245: Treesort, 1964, Commun. ACM.

[4] M. D. MacLaren. The Art of Computer Programming, Volume 1: Fundamental Algorithms (Donald E. Knuth), 1969.

[5] David A. Patterson, et al. Computer Architecture: A Quantitative Approach, 1969.

[6] Manuel Blum, et al. Time Bounds for Selection, 1973, J. Comput. Syst. Sci.

[7] M. V. Wilkes, et al. The Art of Computer Programming, Volume 3, Sorting and Searching, 1974.

[8] Ronald L. Rivest, et al. Expected time bounds for selection, 1975, Commun. ACM.

[9] Robert Sedgewick. Quicksort with Equal Keys, 1977, SIAM J. Comput.

[10] Robert Sedgewick, et al. Implementing Quicksort programs, 1978, Commun. ACM.

[11] Robert E. Tarjan, et al. Amortized efficiency of list update and paging rules, 1985, Commun. ACM.

[12] Bjarne Stroustrup, et al. C++ Programming Language, 1986, IEEE Softw.

[13] Bruce A. Reed, et al. Building Heaps Fast, 1989, J. Algorithms.

[14] Henry D. Shapiro, et al. Algorithms from P to NP, 1991.

[15] Colin McDiarmid, et al. Average Case Analysis of Heap Building by Repeated Insertion, 1991, J. Algorithms.

[16] Bjarne Stroustrup, et al. The C++ programming language (2nd ed.), 1991.

[17] Henry D. Shapiro, et al. Algorithms from P to NP (vol. 1): design and efficiency, 1991.

[18] Ingo Wegener. The Worst Case Complexity of McDiarmid and Reed's Variant of BOTTOM-UP HEAPSORT is less than n log n + 1.1n, 1992, Inf. Comput.

[19] Measuring Cache and TLB Performance and Their Effect on Benchmark Run Times, USC-CS-93-546, 1993.

[20] Rajeev Raman. A simpler analysis of algorithm 65 (find), 1994, SIGA.

[21] Alan Jay Smith, et al. Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes, 1995, IEEE Trans. Computers.

[22] David A. Patterson, et al. Computer architecture (2nd ed.): a quantitative approach, 1996.

[23] Richard E. Ladner, et al. The influence of caches on the performance of heaps, 1996, JEAL.

[24] Jesper Larsson Träff, et al. A Meticulous Analysis of Mergesort Programs, 1997, CIAC.

[25] Richard E. Ladner, et al. The influence of caches on the performance of sorting, 1997, SODA '97.

[26] David S. L. Wei, et al. Computer Algorithms, 1998, Scalable Comput. Pract. Exp.

[27] Donald E. Knuth, et al. The art of computer programming, volume 3: (2nd ed.) sorting and searching, 1998.

[28] Torben Hagerup, et al. Sorting and Searching on the Word RAM, 1998, STACS.

[29] Jukka Teuhola, et al. Heaps and Heapsort on Secondary Storage, 1999, Theor. Comput. Sci.

[30] Maz Spork, et al. Design and Analysis of Cache-Conscious Programs, 1999.

[31] Peter Sanders, et al. Fast priority queues for cached memory, 1999, JEAL.

[32] David Thomas, et al. The Art in Computer Programming, 2001.

[33] Heaps and Heapsort.