Paper information - Lazy parallelization: a finite state machine based optimization approach for data parallel image processing applications

Performance obtained with existing library-based parallelization tools for implementing high performance image processing applications is often sub-optimal. This is because inter-operation optimization (or: optimization across library calls) is often not incorporated in the library implementations. This paper presents a simple, efficient, finite state machine-based method for global performance optimization, called 'lazy parallelization'. Experimental results based on this approach show significant performance improvements over non-optimized parallel implementations.
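The abstract does not spell out the state machine itself, so the following is only a minimal sketch of the general idea of inter-operation optimization, not the authors' implementation. It assumes a hypothetical two-state model for an image (`ROOT`: complete on one process; `SCATTERED`: partitioned across processes) and hypothetical pixel-wise operation names. Communication steps are emitted only when the state required by the next library call differs from the current state, and the result is compared with a schedule that scatters and gathers around every parallel call.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical distribution states of one image (an assumption)."""
    ROOT = auto()       # full image valid on the root process only
    SCATTERED = auto()  # image partitioned across all processes


# (current state, required state) -> communication steps to insert.
TRANSITIONS = {
    (State.ROOT, State.ROOT): [],
    (State.ROOT, State.SCATTERED): ["scatter"],
    (State.SCATTERED, State.SCATTERED): [],
    (State.SCATTERED, State.ROOT): ["gather"],
}


def eager_schedule(ops):
    """Non-optimized baseline: each parallel call scatters, computes, gathers."""
    schedule = []
    for name, required in ops:
        if required is State.SCATTERED:
            schedule += ["scatter", name, "gather"]
        else:
            schedule.append(name)
    return schedule


def lazy_schedule(ops, start=State.ROOT, final=State.ROOT):
    """Insert communication only when the state machine must change state.

    Simplification: an operation leaves the image in the state it required.
    """
    state = start
    schedule = []
    for name, required in ops:
        schedule += TRANSITIONS[(state, required)]
        schedule.append(name)
        state = required
    # Bring the image back to the state the caller expects at the end.
    schedule += TRANSITIONS[(state, final)]
    return schedule


def comm_count(schedule):
    """Number of communication steps in a schedule."""
    return sum(step in ("scatter", "gather") for step in schedule)


# Hypothetical sequence of library calls; names are illustrative only.
OPS = [
    ("scale", State.SCATTERED),
    ("threshold", State.SCATTERED),
    ("invert", State.SCATTERED),
    ("write_file", State.ROOT),
]

if __name__ == "__main__":
    print(eager_schedule(OPS), comm_count(eager_schedule(OPS)))
    print(lazy_schedule(OPS), comm_count(lazy_schedule(OPS)))
```

In this toy sequence the baseline issues six communication steps and the lazy schedule two, because the three consecutive pixel-wise calls share one scatter and one gather. A realistic model would need more states (for example, for neighbourhood operations that require border exchange), which this sketch deliberately leaves out.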

Dennis Koelma | Frank J. Seinstra
