Chapter 3 - Parallel Affine Image Warping

Several methods for parallel affine image warping on a linear processor array are considered. The methods were implemented on the Carnegie Mellon Warp machine and the Carnegie Mellon-Intel Corporation iWarp computer, and performance figures are provided. Four classes of method are studied: systolic methods, which feed one of the images in as a stream; data-partitioned methods, which partition the images across the processor array; scanline-transpose methods, which perform an affine image warp as two skew operations; and sweep-based methods, which move the computation across the stored image. We identify three characteristics that affect the design of parallel image warping algorithms: an affine warp is easily invertible, the mapping is known at the start of execution, and nearby input pixels usually map to nearby output pixels. The methods are evaluated with respect to efficiency, capability, and memory use. Data-partitioned methods are relatively easy to implement but do not scale well. The best method overall is the sweep-based one, which pipelines the computation efficiently and uses load balancing to achieve excellent scaling.
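The three characteristics listed above can be illustrated with a small sequential sketch. It is not one of the parallel methods evaluated in the chapter; it only shows, under stated assumptions, why they matter. The function names `invert_affine` and `warp_affine`, the coefficient order `(a, b, c, d, tx, ty)`, and the choice of nearest-neighbour sampling are all assumptions made for this illustration. The warp is inverted once before the loop (easily invertible, mapping known up front), and each step along an output scanline moves the source coordinate by a constant increment (nearby pixels map to nearby pixels).

```python
import math


def invert_affine(a, b, c, d, tx, ty):
    """Invert the map (x, y) -> (a*x + b*y + tx, c*x + d*y + ty)."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular affine map has no inverse")
    ia, ib = d / det, -b / det
    ic, id_ = -c / det, a / det
    # The inverse translation is -(A^-1) * t.
    itx = -(ia * tx + ib * ty)
    ity = -(ic * tx + id_ * ty)
    return ia, ib, ic, id_, itx, ity


def warp_affine(src, coeffs, out_h, out_w, fill=0):
    """Inverse-mapped affine warp of a list-of-rows image.

    Nearest-neighbour sampling is an assumption of this sketch.
    """
    ia, ib, ic, id_, itx, ity = invert_affine(*coeffs)
    in_h, in_w = len(src), len(src[0])
    out = [[fill] * out_w for _ in range(out_h)]
    for y in range(out_h):
        # Source coordinate of the first pixel on this output scanline.
        sx = ib * y + itx
        sy = id_ * y + ity
        for x in range(out_w):
            ix = math.floor(sx + 0.5)
            iy = math.floor(sy + 0.5)
            if 0 <= ix < in_w and 0 <= iy < in_h:
                out[y][x] = src[iy][ix]
            # One step along the output row is a constant step in the source.
            sx += ia
            sy += ic
    return out


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6]]
    # Shift the image one pixel to the right.
    shifted = warp_affine(image, (1, 0, 0, 1, 1, 0), out_h=2, out_w=3)
    print(shifted)
```

Because every output pixel is computed independently from a mapping that is fixed in advance, the outer loop is the natural place to divide work among processors; how that division is made is where the four classes of method differ.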

[1] Shekhar Y. Borkar, et al. iWarp: an integrated solution to high-speed parallel computing, 1988, Proceedings. SUPERCOMPUTING '88.

[2] H. T. Kung, et al. The Warp Computer: Architecture, Implementation, and Performance, 1987, IEEE Transactions on Computers.