Improving the performance of a thread-level speculation library

Speculative parallelization is a technique that tries to extract parallelism from loops that cannot be parallelized at compile time. The underlying idea is to optimistically execute the code in parallel while a subsystem checks that sequential semantics have not been violated. Many proposals exist in this field; however, to the best of our knowledge, no existing solution can effectively parallelize applications that use pointer arithmetic. In previous work, the authors of this paper presented a software library that allows the parallelization of this kind of application. Nevertheless, that software had an important limitation: the execution time of the parallelized versions was higher than that of the sequential version. This work addresses that limitation by identifying and removing the causes of the inefficiency. The experimental results obtained allow us to affirm that this limitation has been overcome.
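The general idea of optimistic execution plus a checking subsystem can be illustrated with a minimal sketch. This is not the library described in this work: the names `speculative_loop`, `run_iteration`, `load`, and `store` are hypothetical, and the snapshot-and-commit scheme is only one simple way to detect violations of sequential semantics. Each iteration in a window runs in its own thread against a snapshot of the shared data, buffering its writes and logging the locations it reads. Iterations then commit in sequential order, and an iteration that read a location written by an earlier iteration of the same window is squashed and re-executed.

```python
from concurrent.futures import ThreadPoolExecutor


def run_iteration(i, snapshot, reads, writes, body):
    """Run one iteration speculatively against a read-only snapshot."""
    def load(addr):
        if addr in writes:          # value produced by this same iteration
            return writes[addr]
        reads.add(addr)             # exposed read: must be validated later
        return snapshot[addr]

    def store(addr, value):
        writes[addr] = value        # buffered until the iteration commits

    body(i, load, store)


def speculative_loop(data, n_iters, body, workers=4):
    """Execute body(i, load, store) for i in range(n_iters) speculatively.

    Hypothetical sketch: returns the number of squashed windows.
    """
    squashes = 0
    next_iter = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while next_iter < n_iters:
            window = range(next_iter, min(next_iter + workers, n_iters))
            snapshot = list(data)
            logs = [(set(), {}) for _ in window]
            futures = [pool.submit(run_iteration, i, snapshot, r, w, body)
                       for i, (r, w) in zip(window, logs)]
            for f in futures:
                f.result()
            # In-order commit: the first iteration of a window always
            # commits, so the loop is guaranteed to make progress.
            written = set()
            for i, (reads, writes) in zip(window, logs):
                if reads & written:
                    # Read a location that an earlier iteration wrote:
                    # squash this iteration and all later ones.
                    squashes += 1
                    break
                for addr, value in writes.items():
                    data[addr] = value
                written.update(writes)
                next_iter = i + 1
    return squashes


if __name__ == "__main__":
    # src[i] is known only at run time, so the dependences between
    # iterations cannot be resolved by a compiler.
    src = [0, 1, 2, 0, 4, 5, 3, 7]
    data = [0] * len(src)

    def body(i, load, store):
        store(i, load(src[i]) + 1)

    squashed = speculative_loop(data, len(src), body)
    print(data, "squashed windows:", squashed)
```

The sketch also hints at where overhead comes from in software-only schemes: every access goes through `load` and `store`, and each squash discards work that was already done.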
