The implementation of P³I, a parallel architecture for video real-time processing: a case study

This paper provides a tutorial on the motivations, design, and applications of parallel processing applied to real-time video, illustrated by the experience gained in implementing the P³I machine. Its main purpose is to highlight the motivations for such a development, the basic implementation choices, the major difficulties encountered, and how they were solved. Through these studies we found that parallel processing is well suited to real-time video when programmable implementations are considered. The P³I project produced many outcomes, ranging from architectural considerations to parallel algorithm optimizations and programming methodology. We want to emphasize three conclusions. First, programming a single architecture that combines different parallel paradigms is tractable, and this heterogeneity is cost-effective and efficient in terms of processing performance. Second, concerning the well-known debate about how to match parallel architectures to image processing "levels", we conclude that the key is not to discuss Flynn's taxonomy (i.e., data versus task parallelism) but to consider how the grain of parallelism evolves within a whole application. Third, we confirm that in the field of image processing, the efficiency of parallelism can only be gained if algorithm developers think "parallel"; this result seems obvious, but consider the trend in recent RISC processors, which embed more and more parallelism while claiming compatibility with existing sequential software.
