Dynamic Thread Mapping Based on Machine Learning for Transactional Memory Applications

Thread mapping is an appealing approach to efficiently exploit the potential of modern chip multiprocessors. However, efficient thread mapping depends on matching the behavior of an application with the characteristics of the system. Software Transactional Memory (STM) adds a further dimension, because its runtime system also influences application behavior. In this work, we propose a dynamic thread mapping approach that automatically infers a suitable thread mapping strategy for transactional memory applications composed of multiple execution phases, each with potentially different transactional behavior. At runtime, the approach profiles the application at specific periods and consults a decision tree generated by a Machine Learning algorithm to decide whether the current thread mapping strategy should be switched to a more adequate one. We implemented this approach in a state-of-the-art STM system, making it transparent to the user. Our results show that the proposed dynamic approach improves performance by up to 31% compared to the best static solution.
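The profile-then-decide loop described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the profile features (`abort_ratio`, `tx_time_ratio`, `llc_miss_ratio`), the thresholds, the strategy names, and the hand-written `predict_mapping` function, which merely stands in for a decision tree that would in practice be learned offline.

```python
from dataclasses import dataclass


@dataclass
class PhaseProfile:
    """Hypothetical metrics collected during one profiling window."""
    abort_ratio: float     # aborted / started transactions
    tx_time_ratio: float   # fraction of time spent inside transactions
    llc_miss_ratio: float  # last-level cache misses / accesses


def predict_mapping(p: PhaseProfile) -> str:
    """Hand-written stand-in for a learned decision tree.

    Thresholds and strategy names are illustrative assumptions.
    """
    if p.tx_time_ratio < 0.3:
        return "os_default"
    if p.abort_ratio > 0.5:
        return "compact"
    if p.llc_miss_ratio > 0.2:
        return "scatter"
    return "round_robin"


def run_phases(profiles, initial="os_default"):
    """Return the strategy in effect after each profiling window."""
    current = initial
    history = []
    for p in profiles:
        wanted = predict_mapping(p)
        if wanted != current:
            # A real runtime would re-pin the threads to cores here.
            current = wanted
        history.append(current)
    return history


if __name__ == "__main__":
    phases = [
        PhaseProfile(abort_ratio=0.1, tx_time_ratio=0.1, llc_miss_ratio=0.1),
        PhaseProfile(abort_ratio=0.7, tx_time_ratio=0.8, llc_miss_ratio=0.1),
        PhaseProfile(abort_ratio=0.1, tx_time_ratio=0.8, llc_miss_ratio=0.4),
    ]
    print(run_phases(phases))
```

In this sketch the strategy changes only when the tree's prediction differs from the strategy already in effect, so a phase that keeps the same behavior causes no remapping.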
