Landing OpenMP on Cyclops-64: an efficient mapping of OpenMP to a many-core system-on-a-chip
[1] Michael L. Scott, et al. Algorithms for scalable synchronization on shared-memory multiprocessors, 1991, TOCS.
[2] Larry Rudolph, et al. Dynamic decentralized cache schemes for MIMD parallel processors, 1984, ISCA '84.
[3] Eduard Ayguadé, et al. Optimizing NANOS OpenMP for the IBM Cyclops multithreaded architecture, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[4] G. Gao, et al. FAST: A Functionally Accurate Simulation Toolset for the Cyclops 64 Cellular Architecture, 2005.
[5] Maged M. Michael. Hazard pointers: safe memory reclamation for lock-free objects, 2004, IEEE Transactions on Parallel and Distributed Systems.
[6] Maurice Herlihy, et al. Transactional Memory: Architectural Support For Lock-free Data Structures, 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[7] Maged M. Michael, et al. High performance dynamic lock-free hash tables and list-based sets, 2002, SPAA '02.
[8] Barbara M. Chapman, et al. Performance Comparisons of Basic OpenMP Constructs, 2002, ISHPC.
[9] Timothy L. Harris, et al. A Pragmatic Implementation of Non-blocking Linked-Lists, 2001, DISC.
[10] Dennis Shasha, et al. Concurrent set manipulation without locking, 1988, PODS '88.
[11] Thomas E. Anderson, et al. The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors, 1990, IEEE Trans. Parallel Distributed Syst.
[12] Rudolf Berrendorf, et al. Performance characteristics for OpenMP constructs on different parallel computer architectures, 2000.
[13] Maurice Herlihy, et al. Nonblocking memory management support for dynamic-sized data structures, 2005, TOCS.
[14] J. M. Bull, et al. Measuring Synchronisation and Scheduling Overheads in OpenMP, 2007.
[15] Shreekant S. Thakkar, et al. Synchronization algorithms for shared-memory multiprocessors, 1990, Computer.
[16] José E. Moreira, et al. Demonstrating the Scalability of a Molecular Dynamics Application on a Petaflops Computer, 2002, International Journal of Parallel Programming.
[17] Nir Shavit, et al. A scalable lock-free stack algorithm, 2004, SPAA '04.
[18] Maged M. Michael, et al. Simple, fast, and practical non-blocking and blocking concurrent queue algorithms, 1996, PODC '96.
[19] David R. Cheriton, et al. Non-blocking synchronization and system design, 1999.
[20] Mitsuhisa Sato, et al. Performance Evaluation of the Omni OpenMP Compiler, 2000, ISHPC.
[21] Guang R. Gao, et al. Toward a Software Infrastructure for the Cyclops-64 Cellular Architecture, 2006, 20th International Symposium on High-Performance Computing in an Advanced Collaborative Environment (HPCS'06).
[22] Ying Qian, et al. Performance characteristics of OpenMP constructs, and application benchmarks on a large symmetric multiprocessor, 2003, ICS '03.
[23] Eduard Ayguadé, et al. Evaluation of OpenMP for the Cyclops Multithreaded Architecture, 2003, WOMPAT.
[24] Maged M. Michael. CAS-Based Lock-Free Algorithm for Shared Deques, 2003, Euro-Par.
[25] John D. Valois. Lock-free linked lists using compare-and-swap, 1995, PODC '95.
[26] Sanjeev Kumar, et al. Evaluating synchronization on shared address space multiprocessors: methodology and performance, 1999, SIGMETRICS '99.