SoC-TM: Integrated HW/SW support for transactional memory programming on embedded MPSoCs

Two overriding concerns in the development of embedded MPSoCs are ease of programming and hardware complexity. In this paper, we present SoC-TM, an integrated HW/SW solution for transactional programming on embedded MPSoCs. Our proposal leverages a Hardware Transactional Memory (HTM) design based on a dedicated HW module for conflict management, whose functionality is exposed to software through compiler directives implemented as an extension to the popular OpenMP programming model. To further improve ease of programming, our framework supports speculative parallelism by enforcing a given commit order in hardware. Our experimental results confirm that SoC-TM is a viable and cost-effective solution for embedded MPSoCs in terms of energy, performance, and productivity.
