Smartlocks: lock acquisition scheduling for self-aware synchronization

As multicore processors become increasingly prevalent, system complexity is skyrocketing. The advent of the asymmetric multicore compounds the problem: it is no longer practical for an average programmer to balance the system constraints associated with today's multicores while also worrying about new problems like asymmetric partitioning and thread interference. Adaptive, or self-aware, computing has been proposed as one method to help application and system programmers confront this complexity. Self-aware systems take some of the burden off programmers by monitoring themselves and optimizing or adapting to meet their goals. This paper introduces Smartlocks, a self-aware synchronization library for multicores and asymmetric multicores. Smartlocks is a spin-lock library that adapts its internal implementation during execution, using heuristics and machine learning, to optimize toward a user-defined goal, which may relate to performance or to problem-specific criteria. Smartlocks builds upon adaptation techniques from prior work such as reactive locks [1], but introduces a novel form of adaptation, which we term lock acquisition scheduling, designed specifically to address asymmetries in multicores. When multiple threads (or processes) are spinning for a lock, lock acquisition scheduling chooses which waiter will get the lock next so as to achieve the best long-term effect. This work demonstrates that lock acquisition scheduling is important for addressing asymmetries in multicores. We study scenarios where core speeds vary dynamically under thermal throttling and intrinsically under manufacturing variability, and we show that Smartlocks significantly outperforms conventional spin-locks and reactive locks. Based on our findings, we provide guidelines on the application scenarios where Smartlocks works best and those where it is less effective.
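The idea of lock acquisition scheduling described above can be illustrated with a toy, single-threaded simulation. This is a minimal sketch under stated assumptions, not the Smartlocks implementation: the names `Waiter`, `pick_next`, and `simulate` are hypothetical, priorities are fixed by hand rather than learned, and each critical section is assumed to cost `1 / speed` time units on the core that runs it.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    thread_id: int
    priority: float  # hypothetical knob: higher means "grant the lock sooner"


def pick_next(waiters):
    """Return the waiter that should receive the lock next.

    Ties are broken by arrival order: max() returns the first maximal
    element, and the list is kept in arrival order.
    """
    if not waiters:
        return None
    return max(waiters, key=lambda w: w.priority)


def simulate(speeds, priorities, budget):
    """Count critical sections completed within `budget` time units.

    Assumption: every thread re-requests the lock immediately after
    releasing it, so the lock is always contended.
    """
    waiters = [Waiter(i, p) for i, p in enumerate(priorities)]
    clock, completed = 0.0, 0
    while True:
        nxt = pick_next(waiters)
        cost = 1.0 / speeds[nxt.thread_id]  # slower core -> longer hold time
        if clock + cost > budget:
            break
        clock += cost
        completed += 1
        # The thread releases the lock and rejoins at the back of the queue.
        waiters.remove(nxt)
        waiters.append(nxt)
    return completed


if __name__ == "__main__":
    speeds = [2.0, 1.0]  # core 0 runs twice as fast as core 1
    fifo = simulate(speeds, priorities=[1.0, 1.0], budget=12.0)
    favored = simulate(speeds, priorities=[2.0, 1.0], budget=12.0)
    print("equal priorities (FIFO order):", fifo)
    print("fast core favored:", favored)
```

With equal priorities the policy degenerates to FIFO handoff, and the slow core holds the lock half of the time. Favoring the fast core completes more critical sections in the same budget, but in this toy model it also starves the slow core entirely. That trade-off is one reason the choice of waiter has to be tuned toward the user-defined goal rather than fixed in advance.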

[1] Marco D. Santambrogio, et al. Application Heartbeats: A Generic Interface for Expressing Performance Goals and Progress in Self-Tuning Systems, 2010.

[2] James R. Goodman, et al. Efficient Synchronization: Let Them Eat QOLB, 1997, International Symposium on Computer Architecture.

[3] Michael L. Scott, et al. Algorithms for scalable synchronization on shared-memory multiprocessors, 1991, TOCS.

[4] Sridhar Mahadevan, et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results, 2004, Machine Learning.

[5] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[6] Karsten Schwan, et al. Implementation of scalable blocking locks using an adaptive thread scheduler, 1996, Proceedings of International Conference on Parallel Processing.

[7] Anoop Gupta, et al. The SPLASH-2 programs: characterization and methodological considerations, 1995, ISCA.

[8] Marina Papatriantafilou, et al. Reactive spin-locks: a self-tuning approach, 2005, 8th International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN'05).

[9] Erik Hagersten, et al. Efficient Synchronization for Nonuniform Communication Architectures, 2002, ACM/IEEE SC 2002 Conference (SC'02).

[10] Onur Mutlu, et al. Self-Optimizing Memory Controllers: A Reinforcement Learning Approach, 2008, 2008 International Symposium on Computer Architecture.

[11] Anna R. Karlin, et al. Empirical studies of competitive spinning for a shared-memory multiprocessor, 1991, SOSP '91.

[12] Erik Hagersten, et al. Queue locks on cache coherent multiprocessors, 1994, Proceedings of 8th International Parallel Processing Symposium.

[13] Martin C. Rinard, et al. Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback, 1999, TOCS.

[14] Engin Ipek, et al. Coordinated management of multiple interacting resources in chip multiprocessors: A machine learning approach, 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.

[15] Michael L. Scott, et al. Scheduler-conscious synchronization, 1997, TOCS.

[16] Beng-Hong Lim, et al. Reactive synchronization algorithms for multiprocessors, 1994, ASPLOS VI.

[17] Theodore Johnson, et al. A Prioritized Multiprocessor Spin Lock, 1997, IEEE Trans. Parallel Distributed Syst.

[18] Michael L. Scott, et al. Scalable reader-writer synchronization for shared-memory multiprocessors, 1991, PPOPP '91.

[19] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[20] Michael L. Scott, et al. Synchronization without contention, 1991, ASPLOS IV.

[21] Hiroaki Takada, et al. Priority inheritance spin locks for multiprocessor real-time systems, 1996, Proceedings Second International Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN'96).