D2.2 White-box methodologies, programming abstractions and libraries
Paul Renaud-Goud | Philippas Tsigas | Anders Gidenstam | Aras Atalar | Phuong Ha | Ibrahim Umar | Vi Tran
[1] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley, 2006.
[2] B. Mandelbrot. Fractal aspects of the iteration of z → λz(1 − z) for complex λ and z, 1980.
[3] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[4] Shirley Moore,et al. Measuring Energy and Power with PAPI , 2012, 2012 41st International Conference on Parallel Processing Workshops.
[5] Paul Renaud-Goud,et al. Models for energy consumption of data structures and algorithms , 2018, ArXiv.
[6] Philippas Tsigas,et al. Reactive multiword synchronization for multiprocessors , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[7] Beng-Hong Lim,et al. Reactive synchronization algorithms for multiprocessors , 1994, ASPLOS VI.
[8] Maged M. Michael. Hazard pointers: safe memory reclamation for lock-free objects , 2004, IEEE Transactions on Parallel and Distributed Systems.
[9] Bill Dally. Power, Programmability, and Granularity: The Challenges of ExaScale Computing , 2011, IPDPS.
[10] Michael A. Bender,et al. Cache-oblivious priority queue and graph algorithm applications , 2002, STOC '02.
[11] Roger Wattenhofer,et al. Efficient multi-word locking using randomization , 2005, PODC '05.
[12] Haim Kaplan,et al. CBTree: A Practical Concurrent Self-Adjusting Search Tree , 2012, DISC.
[13] John Giacomoni,et al. FastForward for efficient pipeline parallelism: a cache-optimized concurrent lock-free queue , 2008, PPoPP.
[14] John Shalf,et al. The International Exascale Software Project roadmap , 2011, Int. J. High Perform. Comput. Appl.
[15] Michel Raynal,et al. A speculation‐friendly binary search tree , 2019, Concurr. Comput. Pract. Exp.
[16] Michael A. Bender,et al. Cache-oblivious streaming B-trees , 2007, SPAA '07.
[17] Kunle Olukotun,et al. A practical concurrent binary search tree , 2010, PPoPP '10.
[18] Jack J. Dongarra,et al. A Portable Programming Interface for Performance Evaluation on Modern Processors , 2000, Int. J. High Perform. Comput. Appl.
[19] S. B. Yao,et al. Efficient locking for concurrent operations on B-trees , 1981, TODS.
[20] Maurice Herlihy,et al. Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[21] Goetz Graefe,et al. A survey of B-tree locking techniques , 2010, TODS.
[22] Philippas Tsigas,et al. Wait-free Programming for General Purpose Computations on Graphics Processors , 2008, IPDPS.
[23] Nir Shavit,et al. Transactional Locking II , 2006, DISC.
[24] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2009, Parallel Comput.
[25] Gerth Stølting Brodal,et al. Cache oblivious search trees via binary trees of small height , 2001, SODA '02.
[26] John D. Valois. Implementing Lock-Free Queues, 1994.
[27] Michael A. Bender,et al. Cache-oblivious B-trees , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[28] Robert E. Tarjan,et al. Amortized efficiency of list update and paging rules , 1985, CACM.
[29] Philippas Tsigas,et al. NOBLE: A Non-Blocking Inter-Process Communication Library, 2002.
[30] Michael A. Bender,et al. Concurrent cache-oblivious b-trees , 2005, SPAA '05.
[31] Mark Moir,et al. Using elimination to implement scalable and lock-free FIFO queues , 2005, SPAA '05.
[32] Paul Renaud-Goud,et al. White-box methodologies, programming abstractions and libraries , 2018, ArXiv.
[33] Ronald L. Rivest,et al. Introduction to Algorithms, third edition , 2009 .
[34] Maurice Herlihy,et al. Nonblocking memory management support for dynamic-sized data structures , 2005, TOCS.
[35] Philippas Tsigas,et al. The Synchronization Power of Coalesced Memory Accesses , 2010, IEEE Transactions on Parallel and Distributed Systems.
[36] Erez Petrank,et al. A lock-free B+tree , 2012, SPAA '12.
[37] Anna R. Karlin,et al. Empirical studies of competitive spinning for a shared-memory multiprocessor , 1991, SOSP '91.
[38] Arne Andersson. Faster deterministic sorting and searching in linear space , 1996, Proceedings of 37th Conference on Foundations of Computer Science.
[39] Nir Shavit,et al. The Baskets Queue , 2007, OPODIS.
[40] Philippas Tsigas,et al. Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency , 2010, OPODIS.
[41] Yi Zhang,et al. A simple, fast and scalable non-blocking concurrent FIFO queue for shared memory multiprocessor systems , 2001, SPAA '01.
[42] Giuseppe Serazzi,et al. What to expect when you are consolidating: effective prediction models of application performance on multicores , 2013, Cluster Computing.
[43] Douglas Comer,et al. Ubiquitous B-Tree , 1979, CSUR.
[44] Harumi A. Kuno,et al. Modern B-tree techniques , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[45] Laurent Lefèvre,et al. A survey on techniques for improving the energy efficiency of large-scale distributed systems , 2014, ACM Comput. Surv.
[46] David A. Patterson,et al. A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness , 2013, ISCA.
[47] Rudolf Bayer,et al. Organization and maintenance of large ordered indexes , 1972, Acta Informatica.
[48] Faith Ellen,et al. Non-blocking binary search trees , 2010, PODC.
[49] Richard W. Vuduc,et al. A Roofline Model of Energy , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[50] Peter van Emde Boas,et al. Preserving order in a forest in less than logarithmic time , 1975, 16th Annual Symposium on Foundations of Computer Science (sfcs 1975).
[51] Rolf Fagerberg. Cache-Oblivious Model , 2008, Encyclopedia of Algorithms.
[52] Rajesh Gupta,et al. Evaluating the effectiveness of model-based power characterization, 2011.
[53] Nir Shavit,et al. Reactive Diffracting Trees , 2000, J. Parallel Distributed Comput.
[54] Richard W. Vuduc,et al. Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[55] John David Valois. Lock-free data structures, 1996.
[56] Philippas Tsigas,et al. NB-FEB: A Universal Scalable Easy-to-Use Synchronization Primitive for Manycore Architectures , 2009, OPODIS.
[57] Georg Ofenbeck,et al. Applying the roofline model , 2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[58] Pradeep Dubey,et al. FAST: fast architecture sensitive tree search on modern CPUs and GPUs , 2010, SIGMOD Conference.
[59] Rahul Khanna,et al. RAPL: Memory power estimation and capping , 2010, 2010 ACM/IEEE International Symposium on Low-Power Electronics and Design (ISLPED).
[60] Marina Papatriantafilou,et al. Efficient and Reliable Lock-Free Memory Reclamation Based on Reference Counting , 2009, IEEE Transactions on Parallel and Distributed Systems.
[61] Marina Papatriantafilou,et al. Multiword atomic read/write registers on multiprocessor systems , 2009, JEAL.
[62] Marina Papatriantafilou,et al. Self-tuning reactive diffracting trees , 2007, J. Parallel Distributed Comput.
[63] Philippas Tsigas,et al. NOBLE: non-blocking programming support via lock-free shared abstract data types , 2009, CARN.
[64] Marina Papatriantafilou,et al. Efficient self-tuning spin-locks using competitive analysis , 2007, J. Syst. Softw.
[65] Alok Aggarwal,et al. The input/output complexity of sorting and related problems , 1988, CACM.
[66] Gerth Stølting Brodal,et al. Cache-Oblivious Algorithms and Data Structures , 2004, SWAT.
[67] David A. Patterson,et al. Direction-optimizing breadth-first search , 2012, HiPC 2012.
[68] Marina Papatriantafilou,et al. A lock-free algorithm for concurrent bags , 2011, SPAA '11.
[69] Maged M. Michael,et al. Simple, fast, and practical non-blocking and blocking concurrent queue algorithms , 1996, PODC '96.
[70] Leslie Lamport,et al. Specifying Concurrent Program Modules , 1983, TOPL.
[71] Trevor Brown,et al. Non-blocking k-ary Search Trees , 2011, OPODIS.
[72] Phuong Hoai Ha,et al. DeltaTree: A Practical Locality-aware Concurrent Search Tree , 2013, ArXiv.
[73] Pradeep Dubey,et al. PALM: Parallel Architecture-Friendly Latch-Free Modifications to B+ Trees on Many-Core Processors , 2011, Proc. VLDB Endow.