Cost Optimality and Predictability of Parallel Programming with Skeletons

Skeletons are reusable, parameterized components with well-defined semantics and pre-packaged, efficient parallel implementations. This paper develops a new, provably cost-optimal implementation of the DS (double-scan) skeleton for the divide-and-conquer paradigm. Our implementation is based on a novel data structure called plist (pointed list), and we estimate its performance using an analytical model. We demonstrate the use of the DS skeleton for parallelizing a tridiagonal system solver and report experimental results for its MPI implementation on a Cray T3E and a Linux cluster. The results confirm the performance improvement achieved by the cost-optimal implementation and show that its performance is well predicted by our model.
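To make the idea of a double-scan concrete, the following is a minimal sequential sketch in Python. It assumes that a double-scan is a left-to-right scan followed by a right-to-left scan, and it uses the textbook forward-elimination and back-substitution steps of a tridiagonal solve as the two operators. The names `scan_left`, `scan_right`, `double_scan`, and `solve_tridiagonal` are hypothetical, and the row layout `(a, b, c, d)` is an assumption made for illustration. The sketch shows only a reference semantics; it is not the cost-optimal parallel implementation, the plist data structure, or the MPI code described in the paper.

```python
def scan_left(op, xs):
    """Inclusive left-to-right scan: out[0] = xs[0], out[i] = op(out[i-1], xs[i])."""
    out = []
    for x in xs:
        out.append(x if not out else op(out[-1], x))
    return out


def scan_right(op, xs):
    """Inclusive right-to-left scan: out[-1] = xs[-1], out[i] = op(xs[i], out[i+1])."""
    out = []
    for x in reversed(xs):
        out.append(x if not out else op(x, out[-1]))
    out.reverse()
    return out


def double_scan(op_left, op_right, xs):
    """Assumed double-scan: a left scan whose result is fed into a right scan."""
    return scan_right(op_right, scan_left(op_left, xs))


def forward(prev, cur):
    """Eliminate the sub-diagonal entry of `cur` using the already reduced row `prev`."""
    _, pb, pc, pd = prev
    a, b, c, d = cur
    m = a / pb
    return (0.0, b - m * pc, c, d - m * pd)


def backward(cur, nxt):
    """Eliminate the super-diagonal entry of `cur` using the already reduced row `nxt`."""
    _, nb, _, nd = nxt
    a, b, c, d = cur
    m = c / nb
    return (a, b, 0.0, d - m * nd)


def solve_tridiagonal(rows):
    """Solve a tridiagonal system given as rows (a, b, c, d):
    a = sub-diagonal, b = diagonal, c = super-diagonal, d = right-hand side.
    The first row must have a = 0 and the last row c = 0.
    No pivoting is done, so every reduced diagonal entry is assumed nonzero.
    """
    reduced = double_scan(forward, backward, rows)
    # After both scans each row is (0, b, 0, d), so x_i = d / b.
    return [d / b for (_, b, _, d) in reduced]


if __name__ == "__main__":
    # 3x3 example: [[2,1,0],[1,2,1],[0,1,2]] x = [4,8,8]
    system = [(0.0, 2.0, 1.0, 4.0), (1.0, 2.0, 1.0, 8.0), (1.0, 2.0, 0.0, 8.0)]
    print(solve_tridiagonal(system))
```

Note that `forward` and `backward` are not associative, so the two scans cannot simply be handed to a standard parallel prefix routine; this sketch evaluates them strictly in sequence.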
