Accelerating Critical Section Execution with Asymmetric Multicore Architectures

Contention for critical sections can reduce performance and scalability by causing thread serialization. The proposed Accelerated Critical Sections (ACS) mechanism mitigates this limitation. ACS executes critical sections on the high-performance core of an asymmetric chip multiprocessor (ACMP), which can execute them faster than the smaller cores can.
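
To make the serialization argument concrete, the following is a minimal back-of-the-envelope sketch in Python. It is not the paper's model or evaluation methodology. It assumes a worst case in which all critical-section work is fully serialized and never overlaps with parallel work, and the names `execution_time`, `cs_fraction`, and `cs_speedup` are hypothetical, introduced only for illustration.

```python
def execution_time(cs_fraction: float, n_threads: int, cs_speedup: float = 1.0) -> float:
    """Normalized run time under a crude worst-case contention model.

    Assumptions (illustrative only, not taken from the paper):
      - total single-thread work is normalized to 1.0
      - cs_fraction of that work lies inside critical sections and is
        fully serialized (one thread at a time, no overlap)
      - the remaining work scales perfectly across n_threads
      - cs_speedup is how much faster the large core runs a critical
        section than a small core (1.0 means no acceleration)
    """
    if not 0.0 <= cs_fraction <= 1.0:
        raise ValueError("cs_fraction must be in [0, 1]")
    if n_threads < 1 or cs_speedup <= 0.0:
        raise ValueError("n_threads must be >= 1 and cs_speedup > 0")
    parallel_part = (1.0 - cs_fraction) / n_threads
    serialized_part = cs_fraction / cs_speedup
    return parallel_part + serialized_part


def speedup(cs_fraction: float, n_threads: int, cs_speedup: float = 1.0) -> float:
    """Speedup relative to one thread on a small core."""
    return 1.0 / execution_time(cs_fraction, n_threads, cs_speedup)


if __name__ == "__main__":
    # Hypothetical inputs: 10% of work in critical sections, 8 threads,
    # large core assumed to run critical sections 2x faster.
    print("baseline speedup:", speedup(0.10, 8))
    print("accelerated speedup:", speedup(0.10, 8, cs_speedup=2.0))
```

The input values are made up for illustration and are not measured results. The point of the sketch is only that, once the parallel part has been divided across many threads, the serialized critical-section term dominates run time, so accelerating that term alone can improve scalability.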
