CASPAR: Breaking Serialization in Lock-Free Multicore Synchronization
[1] Harry F. Jordan. Performance measurements on HEP - a pipelined MIMD computer , 1983, ISCA '83.
[2] Nir Shavit. Data structures in the multicore age , 2011, CACM.
[3] Douglas Thain,et al. Qthreads: An API for programming with millions of lightweight threads , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[4] Dan Alistarh,et al. The SprayList: a scalable relaxed priority queue , 2015, PPoPP.
[5] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[6] Richard E. Jones,et al. The Garbage Collection Handbook: The art of automatic memory management , 2011, Chapman and Hall / CRC Applied Algorithms and Data Structures Series.
[7] Kevin P. McAuliffe,et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture , 1985, ICPP.
[8] Dharmendra S. Modha,et al. CAR: Clock with Adaptive Replacement , 2004, FAST.
[9] Maurice Herlihy,et al. Wait-free synchronization , 1991, TOPLAS.
[10] Allan Porterfield,et al. OpenMP task scheduling strategies for multicore NUMA systems , 2012, Int. J. High Perform. Comput. Appl.
[11] James R. Goodman,et al. Efficient Synchronization: Let Them Eat QOLB , 1997, International Symposium on Computer Architecture.
[12] Keshav Pingali,et al. Optimistic parallelism requires abstractions , 2007, PLDI '07.
[13] Nir Shavit,et al. Skiplist-based concurrent priority queues , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[14] Yehuda Afek,et al. Quasi-Linearizability: Relaxed Consistency for Improved Concurrency , 2010, OPODIS.
[15] Nicholas D. Matsakis,et al. The rust language , 2014, HILT '14.
[16] Tarek S. Abdelrahman,et al. Hardware Support for Relaxed Concurrency Control in Transactional Memory , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[17] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[18] Silas Boyd-Wickizer,et al. OpLog: a library for scaling update-heavy data structures , 2014 .
[19] Nir Shavit,et al. The Baskets Queue , 2007, OPODIS.
[20] Ralph Grishman,et al. The NYU ultracomputer—designing a MIMD, shared-memory parallel machine , 2018, ISCA '98.
[21] Per-Åke Larson,et al. Memory allocation for long-running server applications , 1998, ISMM '98.
[22] Nir Shavit,et al. A scalable lock-free stack algorithm , 2004, SPAA '04.
[23] Marc Shapiro,et al. A study of the scalability of stop-the-world garbage collectors on multicores , 2013, ASPLOS '13.
[24] Alejandro Duran,et al. Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP , 2009, 2009 International Conference on Parallel Processing.
[25] Ralph Grishman,et al. The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer , 1983, IEEE Transactions on Computers.
[26] Maged M. Michael,et al. Nonblocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors , 1998, J. Parallel Distributed Comput.
[27] Josep Torrellas,et al. The impact of speeding up critical sections with data prefetching and forwarding , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.
[28] Tarek S. Abdelrahman,et al. Relaxing concurrency control in transactional memory , 2011 .
[29] Ana Sokolova,et al. Distributed queues in shared memory: multicore performance and scalability through quantitative relaxation , 2013, CF '13.
[30] Milo M. K. Martin,et al. RETCON: transactional repair without replay , 2010, ISCA '10.
[31] Jim Jeffers. Intel® Xeon Phi™ Coprocessors , 2013 .
[32] Michael E. Thomadakis,et al. The Architecture of the Nehalem Processor and Nehalem-EP SMP Platforms , 2011 .
[33] Don Marti,et al. OSv - Optimizing the Operating System for Virtual Machines , 2014, USENIX Annual Technical Conference.
[34] D. M. Hutton,et al. The Art of Multiprocessor Programming , 2008 .
[35] Mateo Valero,et al. Architectural Support for Fair Reader-Writer Locking , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[36] Emmett Witchel,et al. Dependence-aware transactional memory for increased concurrency , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[37] Maged M. Michael. Hazard pointers: safe memory reclamation for lock-free objects , 2004, IEEE Transactions on Parallel and Distributed Systems.
[38] James R. Goodman,et al. Inferential Queueing and Speculative Push , 2003, ICS '03.
[39] T. N. Vijaykumar,et al. Wait-n-GoTM: improving HTM performance by serializing cyclic dependencies , 2013, ASPLOS '13.
[40] Maged M. Michael. Scalable lock-free dynamic memory allocation , 2004, PLDI '04.
[41] Josep Torrellas,et al. BulkSMT: Designing SMT processors for atomic-block execution , 2012, IEEE International Symposium on High-Performance Computer Architecture.
[42] Jaejin Lee,et al. SFMalloc: A Lock-Free and Mostly Synchronization-Free Dynamic Memory Allocator for Manycores , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[43] Edward S. Davidson,et al. The Cedar system and an initial performance study , 1998, ISCA '98.
[44] Ana Sokolova,et al. Performance, Scalability, and Semantics of Concurrent FIFO Queues , 2012, ICA3PP.
[45] Leslie G. Valiant. A bridging model for parallel computation , 1990 .
[46] James R. Goodman,et al. Improving the throughput of synchronization by insertion of delays , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[47] Keir Fraser,et al. Practical lock-freedom , 2003 .
[48] Mary K. Vernon,et al. Efficient synchronization primitives for large-scale cache-coherent multiprocessors , 1989, ASPLOS III.
[49] Josep Torrellas,et al. OmniOrder: Directory-based conflict serialization of transactions , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[50] Erez Petrank,et al. Wait-free queues with multiple enqueuers and dequeuers , 2011, PPoPP '11.
[51] Craig Freedman,et al. Hekaton: SQL server's memory-optimized OLTP engine , 2013, SIGMOD '13.
[52] Stefanos Kaxiras,et al. Complexity-effective multicore coherence , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[53] Steven L. Scott,et al. Synchronization and communication in the T3E multiprocessor , 1996, ASPLOS VII.
[54] Dimitrios S. Nikolopoulos,et al. Scalable locality-conscious multithreaded memory allocation , 2006, ISMM '06.
[55] Maged M. Michael,et al. Simple, fast, and practical non-blocking and blocking concurrent queue algorithms , 1996, PODC '96.
[56] Luís E. T. Rodrigues,et al. Virtues and limitations of commodity hardware transactional memory , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[57] Michael Stonebraker,et al. Enterprise Database Applications and the Cloud: A Difficult Road Ahead , 2014, 2014 IEEE International Conference on Cloud Engineering.
[58] Jeffrey H. Meyerson,et al. The Go Programming Language , 2014, IEEE Softw.
[59] Christoph M. Kirsch,et al. Fast and Scalable, Lock-Free k-FIFO Queues , 2013, PaCT.
[60] Lieven Eeckhout,et al. Sniper: Exploring the level of abstraction for scalable and accurate parallel multi-core simulation , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[61] Brian W. Kernighan,et al. The Go Programming Language , 2015 .
[62] Allan Porterfield,et al. The Tera computer system , 1990, ICS '90.
[63] Josep Torrellas,et al. Data forwarding in scalable shared-memory multiprocessors , 1995, ICS '95.
[64] Keshav Pingali,et al. A lightweight infrastructure for graph analytics , 2013, SOSP.
[65] Nir Shavit,et al. An optimistic approach to lock-free FIFO queues , 2004, Distributed Computing.
[66] Ronald G. Dreslinski,et al. Proactive transaction scheduling for contention management , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).