Systolic Opportunities for Multidimensional Data Streams

Portable image processing applications require an efficient, scalable platform with localized computing regions. This paper presents a new class of systolic architecture with area I/O, which exploits the physical data locality of planar data streams by processing data where it falls. A synthesis technique based on dependence graphs, data partitioning, and computation mapping is developed to handle planar data streams and to systematically design arrays with area I/O. Simulation results show that area I/O provides a 16-fold speedup over systems with perimeter I/O. Performance comparisons for a set of signal processing algorithms show that systolic arrays designed with planar data streams in mind are up to three times faster than traditional arrays.
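The difference between perimeter I/O and area I/O can be illustrated with a minimal sketch. It is not the paper's simulator and does not reproduce the reported speedups. It assumes a perimeter array that accepts one column of data per cycle through a single edge, and an area I/O array in which every processing element receives its own value in one cycle. The function names `load_via_perimeter` and `load_via_area` are hypothetical.

```python
def load_via_perimeter(frame):
    """Shift a frame into the array through the left edge, one column per cycle (assumed model)."""
    rows, cols = len(frame), len(frame[0])
    array = [[None] * cols for _ in range(rows)]
    cycles = 0
    # Inject the rightmost column first so it ends up farthest from the input edge.
    for c in reversed(range(cols)):
        for r in range(rows):
            # Each cell passes its value to its right neighbour; the edge cell takes the new input.
            array[r] = [frame[r][c]] + array[r][:-1]
        cycles += 1
    return array, cycles


def load_via_area(frame):
    """Deliver every value directly to its processing element in a single cycle (assumed model)."""
    return [row[:] for row in frame], 1


# Small 4 x 4 test frame with distinct values.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
perimeter_array, perimeter_cycles = load_via_perimeter(frame)
area_array, area_cycles = load_via_area(frame)
print(perimeter_cycles, area_cycles)
```

Under these assumptions the load time of the perimeter array grows with the frame width, while the area I/O load time stays constant. This covers only the data-loading step, not the computation that follows.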
