A Simplified Design Strategy for Mapping Image Processing Algorithms on a SIMD Torus

Abstract: It is proposed to enhance and simplify the programming of a two-dimensional (2-D) torus-connected (or mesh-connected) SIMD array of simple processing elements (PEs) by introducing two dedicated communication registers in each PE. A new SIMD algorithm that transposes a matrix using only two buffers at each PE is described. A method is then proposed to realize a large number of arbitrary, one-to-one, personalized, and concurrent communications between PEs by suitably repeating the matrix transpose algorithm. This approach benefits the implementation of several shift-variant image processing tasks that involve such communication, such as the Hough transform, histogram computation, and median filtering. The dynamic behavior of such a SIMD implementation is data independent, unlike implementations that use greedy methods to handle the overall communication. This feature facilitates the coordinated use of several independently operating SIMD meshes in an emerging computer vision paradigm known as multiview image-sequence analysis (MVISA) for 3-D perception of unstructured dynamic scenes.
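
The abstract does not spell out the transpose algorithm, so the following is only an illustrative sketch of how a data-independent transpose can be carried out on a torus with nearest-neighbor shifts and two buffers per PE. It is not claimed to be the authors' method. It assumes an n x n matrix with one element per PE on an n x n torus; the names `data`, `comm`, `shift_south`, `shift_west`, and `torus_transpose` are hypothetical. Every PE executes the same shift at each step, and a PE at row r, column c latches the value passing through its communication buffer at step k = (r - c) mod n, so the schedule depends only on PE coordinates and never on the data.

```python
def shift_south(t):
    """Every PE passes its comm value to its southern neighbor (torus wrap)."""
    n = len(t)
    return [t[(r - 1) % n][:] for r in range(n)]


def shift_west(t):
    """Every PE passes its comm value to its western neighbor (torus wrap)."""
    n = len(t)
    return [[row[(c + 1) % n] for c in range(n)] for row in t]


def torus_transpose(a):
    """Simulate a lock-step transpose of a square matrix on an n x n torus.

    Each PE holds two buffers: `data` (its matrix element, overwritten in
    place with the transposed element) and `comm` (the value in transit).
    """
    n = len(a)
    data = [row[:] for row in a]   # per-PE data buffer
    comm = [row[:] for row in a]   # per-PE communication buffer
    for k in range(1, n):
        # One diagonal step: all PEs shift south, then west.
        # After k steps, comm[r][c] holds a[(r - k) % n][(c + k) % n].
        comm = shift_west(shift_south(comm))
        for r in range(n):
            for c in range(n):
                # Masked store: only PEs on this anti-diagonal offset latch.
                # For these PEs, (r - k) % n == c and (c + k) % n == r,
                # so the value in transit is a[c][r].
                if (r - c) % n == k:
                    data[r][c] = comm[r][c]
    return data


if __name__ == "__main__":
    m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for row in torus_transpose(m):
        print(row)
```

In this sketch the transpose completes in n - 1 diagonal steps, that is, 2(n - 1) nearest-neighbor shifts, regardless of the matrix contents. PEs on the main diagonal never latch, since their elements are already in place.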
