Optimal Mappings of q-ary and Binomial Trees into Parallel Memory Modules for Fast and Conflict-Free Access to Path and Subtree Templates

Main memory access latency can significantly degrade the overall performance of a computer system, because the average cycle time of main memory is typically 5 to 10 times longer than that of a processor. To cope with this problem, in addition to using caches, the main memory of a multiprocessor architecture is usually organized into multiple modules, or banks. Such an organization increases memory bandwidth, that is, the amount of data the multiprocessor can retrieve in a single memory cycle, but conflicts caused by simultaneous attempts to access the same memory module may reduce the effective bandwidth. Efficient mapping schemes are therefore required to distribute data so that regular patterns of various structures, called templates, can be retrieved in parallel without memory conflicts. Prior work on data mappings mostly dealt with conflict-free access to templates such as rows, columns, or diagonals of (multidimensional) arrays, and only limited attention has been paid to access templates of nonnumeric structures such as trees. In this paper, we study optimal and balanced mappings for accessing path and subtree templates of trees. A mapping is called optimal if it allows conflict-free access to templates with as few memory banks as possible. An optimal mapping is also called balanced if it distributes the nodes of the entire tree as evenly as possible among the available memory banks. In particular, based on Latin squares, we propose an optimal and balanced mapping for leaf-to-root paths of q-ary trees. Another (recursive) mapping for leaf-to-root paths of binary trees raises interesting combinatorial problems. We also derive an optimal and balanced mapping for accessing complete t-ary subtrees of complete q-ary trees, where 2 ≤ t ≤ q, and an optimal mapping for subtrees of binomial trees.
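
To make the terms "conflict-free", "optimal", and "balanced" concrete, the following minimal sketch checks a deliberately naive mapping on a small complete q-ary tree. It is not the Latin-square mapping proposed in the paper: it assumes a simple level-based rule (module = depth mod number of modules), and all function names, as well as the breadth-first node numbering, are hypothetical choices made for illustration only.

```python
from collections import Counter

def level_mapping(depth, num_modules):
    # Assumed naive rule (not the paper's mapping): module = depth mod M.
    return depth % num_modules

def complete_tree_nodes(q, height):
    # Yield (node_id, depth) in breadth-first order; the root is node 0, depth 0.
    node_id = 0
    for depth in range(height + 1):
        for _ in range(q ** depth):
            yield node_id, depth
            node_id += 1

def parent(node_id, q):
    # With breadth-first numbering, the children of i are q*i+1 .. q*i+q.
    return (node_id - 1) // q

def is_conflict_free(q, height, num_modules):
    # True if every leaf-to-root path touches each module at most once.
    depth_of = dict(complete_tree_nodes(q, height))
    leaves = [n for n, d in depth_of.items() if d == height]
    for leaf in leaves:
        used = set()
        node = leaf
        while True:
            module = level_mapping(depth_of[node], num_modules)
            if module in used:
                return False  # two nodes of one path share a module
            used.add(module)
            if node == 0:
                break
            node = parent(node, q)
    return True

def module_loads(q, height, num_modules):
    # Count how many tree nodes each module stores under the naive rule.
    return Counter(level_mapping(d, num_modules)
                   for _, d in complete_tree_nodes(q, height))

if __name__ == "__main__":
    # Binary tree with 4 levels: each leaf-to-root path has 4 nodes.
    print(is_conflict_free(2, 3, 4))  # 4 modules suffice for 4-node paths
    print(is_conflict_free(2, 3, 3))  # 3 modules cannot serve 4 distinct nodes
    print(dict(module_loads(2, 3, 4)))
```

Under these assumptions the level-based rule is conflict-free with exactly as many modules as there are nodes on a path, so it is optimal in the sense defined above. It is far from balanced, however, since the module holding the leaf level stores about half of a binary tree. This gap is what a balanced mapping must close.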
