An Efficient Implementation of Nested Data Parallelism for Irregular Divide-and-Conquer Algorithms

This paper presents work in progress on a new method of implementing irregular divide-and-conquer algorithms in a nested data-parallel language model on distributed-memory multiprocessors. The main features discussed are the recursive subdivision of asynchronous processor groups to match the change from data-parallel to control-parallel behavior over the lifetime of an algorithm, a switch from parallel code to serial code when the group size is one (with the opportunity to use a more efficient serial algorithm), and a simple manager-based run-time load-balancing system. Sample algorithms translated from the high-level nested data-parallel language NESL into C and MPI using this method are significantly faster than the current NESL system, and show the potential for further speedup.
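The group-subdivision idea can be illustrated with a minimal single-process sketch in C, under several assumptions that are not taken from the paper: quicksort is used as the example divide-and-conquer algorithm, a processor group is simulated as a contiguous range of ranks (`first`, `p`) rather than an MPI communicator, and processors are divided in proportion to subproblem size. The names `group_sort`, `partition3`, `serial_elems`, and `MAX_RANKS` are hypothetical. In a real distributed-memory version the two subgroups would proceed asynchronously; here they simply run one after the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_RANKS 64  /* hypothetical upper bound on simulated processors */

/* Number of elements each simulated rank sorted serially (inspection only). */
static size_t serial_elems[MAX_RANKS];

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Three-way partition of a[0..n) around pivot. On return:
   a[0..*lt) < pivot, a[*lt..*gt) == pivot, a[*gt..n) > pivot. */
static void partition3(int *a, size_t n, int pivot, size_t *lt, size_t *gt) {
    size_t lo = 0, i = 0, hi = n;
    while (i < hi) {
        if (a[i] < pivot) {
            int t = a[i]; a[i] = a[lo]; a[lo] = t;
            lo++; i++;
        } else if (a[i] > pivot) {
            hi--;
            int t = a[i]; a[i] = a[hi]; a[hi] = t;
        } else {
            i++;
        }
    }
    *lt = lo;
    *gt = hi;
}

/* Sort a[0..n) using the simulated group of p ranks starting at `first`. */
void group_sort(int *a, size_t n, int first, int p) {
    assert(p >= 1 && first >= 0 && first + p <= MAX_RANKS);
    if (n < 2) return;

    if (p == 1) {
        /* Group of one: switch to serial code (here, the C library qsort). */
        qsort(a, n, sizeof a[0], cmp_int);
        serial_elems[first] += n;
        return;
    }

    /* Data-parallel step in the real system; a plain loop in this sketch. */
    size_t lt, gt;
    partition3(a, n, a[n / 2], &lt, &gt);
    size_t nl = lt, nr = n - gt;
    if (nl + nr == 0) return;  /* every element equals the pivot */

    /* Split the group in proportion to subproblem size (rounded), keeping
       at least one rank on each side so that groups always shrink. */
    int pl = (int)(((size_t)p * nl + (nl + nr) / 2) / (nl + nr));
    if (pl < 1) pl = 1;
    if (pl > p - 1) pl = p - 1;

    group_sort(a, nl, first, pl);                /* left subgroup  */
    group_sort(a + gt, nr, first + pl, p - pl);  /* right subgroup */
}
```

Calling `group_sort(a, n, 0, 4)` sorts `a` in place with four simulated ranks, and `serial_elems` then shows how much work reached each rank's serial phase. The sketch omits the manager-based load balancing mentioned above, and the proportional split is only one plausible policy.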
