Worklist-Directed Prefetching
Derek Chiou | Dan Zhang | Xiaoyu Ma
[1] Christos Kozyrakis,et al. Flexible architectural support for fine-grain scheduling , 2010, ASPLOS 2010.
[2] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[3] Keshav Pingali,et al. The tao of parallelism in algorithms , 2011, PLDI '11.
[4] Srinivas Devadas,et al. IMP: Indirect memory prefetcher , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[5] Calvin Lin,et al. Linearizing irregular memory accesses for improved correlated prefetching , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[6] Keshav Pingali,et al. A lightweight infrastructure for graph analytics , 2013, SOSP.
[7] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[8] Cong Yan,et al. A scalable architecture for ordered parallelism , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[9] L. R. Ford,et al. Network Flow Theory , 1956.
[10] Ulrich Meyer,et al. Δ-stepping: a parallelizable shortest path algorithm , 2003, J. Algorithms.
[11] Pierre Michaud,et al. A case for (partially) TAgged GEometric history length branch prediction , 2006, J. Instr. Level Parallelism.
[12] Keshav Pingali,et al. Ordered vs. unordered: a comparison of parallelism and work-efficiency in irregular algorithms , 2011, PPoPP '11.
[13] Donald E. Knuth,et al. A Generalization of Dijkstra's Algorithm , 1977, Inf. Process. Lett..
[14] Babak Falsafi,et al. Meet the walkers: accelerating index traversals for in-memory databases , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[15] Yale N. Patt,et al. Filtered runahead execution with a runahead buffer , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Kiyoung Choi,et al. A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[17] Christoforos E. Kozyrakis,et al. Flexible architectural support for fine-grain scheduling , 2010, ASPLOS XV.
[18] James E. Smith,et al. Data Cache Prefetching Using a Global History Buffer , 2005, IEEE Micro.
[19] Christopher J. Hughes,et al. Carbon: architectural support for fine-grained parallelism on chip multiprocessors , 2007, ISCA '07.
[20] Sam Ainsworth,et al. Graph Prefetching Using Data Structure Knowledge , 2016, ICS.
[21] Babak Falsafi,et al. Asynchronous Memory Access Chaining , 2015, Proc. VLDB Endow..
[22] Scott A. Mahlke,et al. Accelerating asynchronous programs through Event Sneak Peek , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[23] Keshav Pingali,et al. Priority Queues Are Not Good Concurrent Priority Schedulers , 2015, Euro-Par.
[24] Christoforos E. Kozyrakis,et al. ZSim: fast and accurate microarchitectural simulation of thousand-core systems , 2013, ISCA.