Data Allocation Strategies for Parallel Image Processing Algorithms

This paper discusses several data allocation strategies for the parallel implementation of basic imaging operators. It shows that the image data must be partitioned in very different ways depending on the operator (sequential or parallel, with regular or irregular execution time). Square sub-domains are best suited to minimizing communication volume, but rectangles can perform better once the time needed to construct messages is taken into account. Block allocations are well suited to inherently parallel operators because they minimize interprocessor interactions, but for recursive operators they lead to nearly sequential execution. In this context, we show the usefulness of block-cyclic allocations. Finally, we illustrate that allocating the same amount of image data to each processor can lead to severe load imbalance for some operators with data-dependent execution times.
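
As an informal illustration of the allocations named above, the following Python sketch maps image rows to processors under a block and a block-cyclic allocation, compares the border length of a square and a rectangular sub-domain of equal area, and sums a per-row cost per processor. It is a minimal sketch under stated assumptions, not the implementation used in the paper: all function names are hypothetical, the image is partitioned by rows only, the border length assumes a one-pixel neighborhood and serves only as a rough proxy for communication volume, and the per-row costs are invented for the example.

```python
def block_owner(row, n_rows, n_procs):
    """Owner of `row` when each processor holds one contiguous band of rows."""
    band = -(-n_rows // n_procs)  # ceiling division: rows per band
    return row // band


def block_cyclic_owner(row, block_size, n_procs):
    """Owner of `row` when blocks of `block_size` rows are dealt out cyclically."""
    return (row // block_size) % n_procs


def border_length(height, width):
    """Perimeter of a height x width sub-domain (assumed proxy for data exchanged)."""
    return 2 * (height + width)


def loads(costs, owners, n_procs):
    """Total cost per processor, given a per-row cost and a per-row owner."""
    total = [0] * n_procs
    for cost, owner in zip(costs, owners):
        total[owner] += cost
    return total


n_rows, n_procs = 8, 4
block = [block_owner(r, n_rows, n_procs) for r in range(n_rows)]
cyclic = [block_cyclic_owner(r, 1, n_procs) for r in range(n_rows)]

# Two sub-domains of equal area (256 pixels) with different shapes.
square_border = border_length(16, 16)
rect_border = border_length(4, 64)

# Hypothetical data-dependent cost: the last two rows are far more expensive.
row_costs = [1, 1, 1, 1, 1, 1, 9, 9]
block_loads = loads(row_costs, block, n_procs)

print("block owners:       ", block)
print("block-cyclic owners:", cyclic)
print("border, 16x16 vs 4x64:", square_border, rect_border)
print("load per processor (block):", block_loads)
```

In this toy setting every processor receives two rows under the block allocation, yet one processor carries most of the work, which is the kind of imbalance described above for operators with data-dependent execution times. The border comparison captures only volume; it ignores the message construction time that can favor rectangles.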