Parallel Implementation of Tree Skeletons

Trees are a useful data type, but they are not routinely included in parallel programming systems, in part because their irregular structure makes partitioning and scheduling difficult. We present a method for algebraically constructing implementations of tree skeletons, high-level homomorphic operations that execute in parallel. Many computations on binary trees can be performed in O(log n) parallel time using n processors, even taking account of communication costs. We extend these results to trees with arbitrary and variable degree. Then we show that it is possible to implement a distributed version of homomorphisms on binary trees, taking O(n/p + log^2 p) parallel time on p < n processors, for trees of any skew and taking full account of communication costs. Under slightly stronger restrictions on the underlying functions, this can be improved to O(n/p + log p). Furthermore, the technique for deriving distributed versions is algebraic, allowing the automatic generation of code for SPMD and data-parallel architectures.
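
To make the notion of a homomorphic tree operation concrete, the following is a minimal sequential sketch in Python. It is not the parallel implementation described in the abstract, only a reference semantics under stated assumptions: the names `Leaf`, `Node`, and `tree_hom` are hypothetical, and the tree is assumed to be binary with a value at every leaf and every internal node.

```python
from dataclasses import dataclass
from typing import Any, Callable, Union


@dataclass
class Leaf:
    value: Any


@dataclass
class Node:
    left: "Tree"
    value: Any
    right: "Tree"


Tree = Union[Leaf, Node]


def tree_hom(leaf_fn: Callable[[Any], Any],
             node_fn: Callable[[Any, Any, Any], Any],
             tree: Tree) -> Any:
    """Apply a homomorphism to a binary tree (hypothetical helper).

    leaf_fn maps a leaf value to a result; node_fn combines the result
    of the left subtree, the node's own value, and the result of the
    right subtree. The result for a tree depends only on the results
    for its subtrees, never on their internal shape.
    """
    if isinstance(tree, Leaf):
        return leaf_fn(tree.value)
    left_result = tree_hom(leaf_fn, node_fn, tree.left)
    right_result = tree_hom(leaf_fn, node_fn, tree.right)
    return node_fn(left_result, tree.value, right_result)


# A small example tree with values 1..5.
example = Node(Node(Leaf(1), 2, Leaf(3)), 4, Leaf(5))

# Sum of all values stored in the tree.
total = tree_hom(lambda v: v, lambda l, v, r: l + v + r, example)

# Height of the tree, counting a single leaf as height 0.
height = tree_hom(lambda v: 0, lambda l, v, r: 1 + max(l, r), example)
```

In this sketch the two recursive calls share no state, so the results for the left and right subtrees could in principle be computed independently. For a heavily skewed tree that independence alone gives little parallelism, which is why the bounds above are stated for trees of any skew.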
