Elastic Load-Balancing for Image Processing Algorithms

In this paper, we introduce a data redistribution algorithm that aims to dynamically balance the workload of image processing algorithms on distributed memory processors. First, we briefly review state-of-the-art techniques for load balancing application-specific algorithms. Then we describe the data redistribution technique, which we term “elastic load balancing”, in a general framework. We demonstrate the usefulness of our redistribution strategy by comparing the efficiency obtained with and without the elastic algorithm for a thinning algorithm that extracts the skeleton of a binary image. We report experimental results obtained on a Supernode machine, which is based on reconfigurable networks of 32 Transputers [Nic]. Using a Mandelbrot set as the test image, we obtain a speedup of up to 28 over the sequential algorithm. With a static allocation of the picture, the speedup was limited to 17 on the same test image because of the load imbalance among the processors.
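To put the two reported speedups on a common scale, they can be converted to parallel efficiency (speedup divided by processor count). The short sketch below does this arithmetic. It assumes that both runs used all 32 Transputers, which the abstract does not state explicitly; the function name `parallel_efficiency` and the constant `PROCESSORS` are illustrative only.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return parallel efficiency, defined as speedup / number of processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Assumption: both measurements were taken on all 32 Transputers.
PROCESSORS = 32

# Speedups reported in the abstract for the Mandelbrot test image.
elastic = parallel_efficiency(28, PROCESSORS)  # with elastic load balancing
static = parallel_efficiency(17, PROCESSORS)   # with static allocation

print(f"elastic load balancing: {elastic:.3f}")
print(f"static allocation:      {static:.3f}")
```

Under that assumption, the elastic algorithm corresponds to an efficiency of about 0.88, against about 0.53 for the static allocation.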

[1] K. A. Teague, et al. Parallel Thinning on a Distributed Memory Machine, 1990, Proceedings of the Fifth Distributed Memory Computing Conference.

[2] Jian Xu, et al. Heuristic methods for dynamic load balancing in a message-passing supercomputer, 1990, Proceedings SUPERCOMPUTING '90.

[3] Yves Robert, et al. Image processing algorithms on distributed memory machines, 1991.

[4] Shahid H. Bokhari. Partitioning Problems in Parallel, Pipelined, and Distributed Computing, 1988, IEEE Trans. Computers.

[5] George Cybenko, et al. Dynamic Load Balancing for Distributed Memory Multiprocessors, 1989, J. Parallel Distributed Comput.

[6] Yousef Saad, et al. Data communication in parallel architectures, 1989, Parallel Comput.

[7] Yves Robert. The Impact of Vector and Parallel Architectures on the Gaussian Elimination Algorithm, 1991.

[8] Oliver A. McBryan, et al. Hypercube Algorithms and Implementations, 1985, PPSC.

[9] P. Sadayappan, et al. Nearest-Neighbor Mapping of Finite Element Graphs onto Processor Meshes, 1987, IEEE Transactions on Computers.

[10] J. P. Jones, et al. An Input/Output Algorithm for M-Dimensional Rectangular Domain Decompositions on N-Dimensional Hypercube Multicomputers, 1990, Proceedings of the Fifth Distributed Memory Computing Conference.

[11] Denis A. Nicole. ESPRIT project 1085, reconfigurable transputer processor architecture, 1989.

[12] Zicheng Guo, et al. Parallel thinning with two-subiteration algorithms, 1989, Commun. ACM.

[13] Jan Olszewski, et al. A flexible thinning algorithm allowing parallel, sequential, and distributed application, 1992, TOMS.

[14] Lawrence Snyder, et al. An Algorithm Producing Balanced Partitionings of Data Arrays, 1990, Proceedings of the Fifth Distributed Memory Computing Conference.

[15] Stéphane Ubéda. Comparison of thinning algorithms on distributed memory machines, 1992.

[16] M. H. Schultz, et al. Topological properties of hypercubes, 1988, IEEE Trans. Computers.

[17] Robert P. Weaver, et al. Mapping Data to Processors in Distributed Memory Computations, 1990, Proceedings of the Fifth Distributed Memory Computing Conference.

[18] Prithviraj Banerjee, et al. Recursive Partitions On Multiprocessor, 1990, Proceedings of the Fifth Distributed Memory Computing Conference.

[19] Shahid H. Bokhari, et al. A Partitioning Strategy for Nonuniform Problems on Multiprocessors, 1987, IEEE Transactions on Computers.