Hierarchical memory with block transfer

In this paper we introduce a model of Hierarchical Memory with Block Transfer (BT for short). It is like a random access machine, except that access to location x takes time f(x), and a block of consecutive locations can be copied from memory to memory, taking one unit of time per element after the initial access time. We first study the model with f(x) = x^α for 0 < α < 1. A tight bound of Θ(n log log n) is shown for many simple problems: reading each input, dot product, shuffle exchange, and merging two sorted lists. The same bound holds for transposing a √n × √n matrix; we use this to compute an FFT graph in optimal Θ(n log n) time. An optimal Θ(n log n) sorting algorithm is also shown. Some additional issues considered are maintaining data structures such as dictionaries, DAG simulation, and connections with PRAMs. Next we study the model f(x) = x. Using techniques similar to those developed for the previous model, we show tight bounds of Θ(n log n) for the simple problems mentioned above, and provide a new technique that yields optimal lower bounds of Ω(n log^2 n) for sorting, computing an FFT graph, and matrix transposition. We also obtain optimal bounds for the model f(x) = x^α with α > 1. Finally, we study the model f(x) = log x and obtain optimal bounds of Θ(n log* n) for the simple problems mentioned above and of Θ(n log n) for sorting, computing an FFT graph, and some permutations.
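The cost model described above can be illustrated with a minimal Python sketch. The function names (`access_cost`, `block_copy_cost`, `naive_scan_cost`, `f_poly`) are hypothetical and chosen only for this illustration. The sketch also assumes that the initial access time of a block copy is charged at the farther of the two block end addresses; the abstract states only that a copy costs one unit per element after the initial access time.

```python
import math

def access_cost(x, f):
    """Cost of a single access to location x under access function f."""
    return f(x)

def block_copy_cost(src_end, dst_end, length, f):
    """Cost of copying `length` consecutive cells between two blocks.

    Assumption (not stated in the abstract): the initial access is charged
    at the farther of the two block end addresses, then one unit per element.
    """
    return f(max(src_end, dst_end)) + length

def naive_scan_cost(n, f):
    """Cost of touching locations 1..n one at a time, with no block copies."""
    return sum(f(x) for x in range(1, n + 1))

def f_poly(alpha):
    """Access function f(x) = x**alpha, one of the families studied above."""
    return lambda x: x ** alpha

if __name__ == "__main__":
    n = 1 << 16
    f = f_poly(0.5)
    print("one access at address n:   ", access_cost(n, f))
    print("cell-by-cell scan of 1..n: ", naive_scan_cost(n, f))
    print("one block copy of length n:", block_copy_cost(n, n, n, f))
```

Under these assumptions, a single long block copy costs f(n) + n, while touching the same cells one at a time costs the sum of f(x) over all addresses. That gap is what makes block transfer worth modelling separately.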