Dynamic parallel media processing using speculative broadcast loop (SBL)

This paper presents a study of dynamic parallel media processing using the Speculative Broadcast Loop (SBL), a speculative run-time loop-level parallelization method. Because their processing is highly regular, multimedia applications typically contain extensive parallelism. Subword parallelism is commonly used to exploit data parallelism between independent iterations of inner loops, but much of the data parallelism in media processing resides in outer loops, where subword parallelism cannot reach it. Larger-scale parallel methods are therefore needed to exploit the full range of data parallelism in multimedia. Because static parallelizing compilers are often unable to recognize all parallelism at compile time, the study assumes a run-time method that speculatively executes potentially parallel loops. The SBL run-time method combines SIMD parallelism with large-scale speculation to support data parallelism in multimedia.
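
The general idea of speculative run-time loop parallelization can be illustrated with a minimal sketch. This is not the SBL mechanism itself, which the abstract does not detail; it assumes a simple software scheme in which each iteration buffers its writes and logs its accesses, and a conservative conflict check decides whether to commit the buffered results or re-run the loop sequentially. The names `run_speculatively`, `body`, `read`, and `write` are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_speculatively(body, n_iters, memory):
    """Illustrative only: speculatively run body(i, read, write) for all i.

    Returns True if the speculative results were committed, or False if a
    cross-iteration conflict forced a sequential re-execution.
    """

    def one_iteration(i):
        reads, writes = set(), {}

        def read(addr):
            if addr in writes:          # value produced by this same iteration
                return writes[addr]
            reads.add(addr)             # value comes from shared memory
            return memory[addr]

        def write(addr, value):
            writes[addr] = value        # buffered, not yet visible to others

        body(i, read, write)
        return reads, writes

    # Speculative phase: shared memory is only read, so iterations may overlap.
    with ThreadPoolExecutor() as pool:
        logs = list(pool.map(one_iteration, range(n_iters)))

    # Conservative check: an address written by one iteration must not be
    # read or written by any other iteration.
    written, accessed = set(), set()
    conflict = False
    for reads, writes in logs:
        w = set(writes)
        touched = reads | w
        if (w & accessed) or (touched & written):
            conflict = True
            break
        written |= w
        accessed |= touched

    if conflict:
        # Mis-speculation: discard the buffers and run the loop in order.
        for i in range(n_iters):
            body(i, memory.__getitem__, memory.__setitem__)
        return False

    # Speculation succeeded: commit buffered writes in iteration order.
    for _, writes in logs:
        for addr, value in writes.items():
            memory[addr] = value
    return True


def scale(i, read, write):
    # Independent iterations: each one reads slot i and writes slot 4 + i.
    write(4 + i, read(i) * 2)


def prefix_sum(i, read, write):
    # Dependent iterations: each one reads the previous iteration's result.
    if i > 0:
        write(i, read(i - 1) + read(i))


pixels = [1, 2, 3, 4, 0, 0, 0, 0]
committed = run_speculatively(scale, 4, pixels)

running = [1, 1, 1, 1]
committed_dependent = run_speculatively(prefix_sum, 4, running)
```

In this sketch the `scale` loop has no cross-iteration dependences, so its speculative results are committed, while the `prefix_sum` loop carries a dependence between iterations and falls back to sequential execution. The check is deliberately stricter than necessary (it also rejects anti-dependences that buffered writes would tolerate), which keeps the example short.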
