A Novel Concurrent Validation Scheme for Hardware Transactional Memory

Transactional memory is a lock-free parallel programming model, which aims at replacing conventional lock-based threaded programming techniques, currently used by multi-core systems. These techniques are difficult to implement and impose unnecessary overheads caused by conservative programming practices. In this thesis, the scalability potential of a transactional memory system, called TMFab, was explored for different numbers of processors and it was concluded that for more than 4 processors the system presents reduced scalability, due to an increase in the validation overhead. In response to this observation, a novel validation scheme was proposed which reduces this overhead, first by allowing multiple transactions to perform their validations and commit operations concurrently, and second by removing the need for broadcasting messages between the active transactions. A distributed shared memory scheme was used to increase the validation and memory access throughput, as well as allow for transactions to commit concurrently on different memory partitions. The two architectures were compared by means of SystemC simulation, and a maximum of 2.5x validation speedup was observed for the modified design, together with a 2.7x reduction in memory access latency. In total, the modified design achieved a maximum execution speedup of 30% over the original, for the benchmarks that were used. Furthermore, the modified system guarantees sequential consistency even in corner case scenarios.

[1]  Eby G. Friedman,et al.  3-D Topologies for Networks-on-Chip , 2006, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[2]  Sumeet S. Kumar,et al.  A 3D Network-on-Chip for stacked-die transactional chip multiprocessors using Through Silicon Vias , 2011, 2011 6th International Conference on Design & Technology of Integrated Systems in Nanoscale Era (DTIS).

[3]  Maged M. Michael,et al.  RingSTM: scalable transactions with a single atomic instruction , 2008, SPAA '08.

[4]  Maurice Herlihy,et al.  Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.

[5]  Tarek S. Abdelrahman,et al.  Hardware Support for Relaxed Concurrency Control in Transactional Memory , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[6]  Kunle Olukotun,et al.  STAMP: Stanford Transactional Applications for Multi-Processing , 2008, 2008 IEEE International Symposium on Workload Characterization.

[7]  Luca Benini,et al.  Contrasting a NoC and a Traditional Interconnect Fabric with Layout Awareness , 2006, Proceedings of the Design Automation & Test in Europe Conference.

[8]  James R. Goodman,et al.  Transactional lock-free execution of lock-based programs , 2002, ASPLOS X.

[9]  Kunle Olukotun,et al.  Transactional memory coherence and consistency , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[10]  Babak Falsafi,et al.  Cuckoo directory: A scalable directory for many-core systems , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.

[11]  Sanjay J. Patel,et al.  WAYPOINT: scaling coherence to thousand-core architectures , 2010, PACT '10.

[12]  Mateo Valero,et al.  RMS-TM: a comprehensive benchmark suite for transactional memory systems , 2011, ICPE '11.

[13]  S. S. Kumar TMFab: A Transactional Memory Fabric for Chip Multiprocessors , 2010 .

[14]  David A. Patterson,et al.  Computer Architecture, Fifth Edition: A Quantitative Approach , 2011 .

[15]  Sandhya Dwarkadas,et al.  SPACE: Sharing pattern-based directory coherence for multicore scalability , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).

[16]  Andrew Brownsword,et al.  Hardware transactional memory for GPU architectures , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[17]  Kunle Olukotun,et al.  A Scalable, Non-blocking Approach to Transactional Memory , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.