Multi-stage coordinated prefetching for present-day processors
[1] Anoop Gupta, et al. Design and evaluation of a compiler algorithm for prefetching, 1992, ASPLOS V.
[2] Ken Kennedy, et al. Software prefetching, 1991, ASPLOS IV.
[3] Emre Kultursay, et al. Compiler-Based Data Prefetching and Streaming Non-temporal Store Generation for the Intel(R) Xeon Phi(TM) Coprocessor, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum.
[4] Eric M. Schwarz, et al. IBM POWER6 microarchitecture, 2007, IBM J. Res. Dev.
[5] Kathryn S. McKinley, et al. Guided region prefetching: a cooperative hardware/software approach, 2003, ISCA '03.
[6] Donald Yeung, et al. Design and evaluation of compiler algorithms for pre-execution, 2002, ASPLOS X.
[7] Jean-Loup Baer, et al. A performance study of software and hardware data prefetching schemes, 1994, ISCA '94.
[8] Henry M. Levy, et al. An Architecture for Software-Controlled Data Prefetching, 1991, ISCA.
[9] Wei-Chung Hsu, et al. Data Prefetching On The HP PA-8000, 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[10] Mahmut T. Kandemir, et al. A compiler-directed data prefetching scheme for chip multiprocessors, 2009, PPoPP '09.
[11] Richard W. Vuduc, et al. When Prefetching Works, When It Doesn’t, and Why, 2012, TACO.
[12] Fuat Keceli, et al. Resource-Aware Compiler Prefetching for Many-Cores, 2010, 2010 Ninth International Symposium on Parallel and Distributed Computing.
[13] Dean M. Tullsen, et al. Inter-core prefetching for multicore processors using migrating helper threads, 2011, ASPLOS XVI.
[14] Markus Schordan, et al. The Specification of Source-to-Source Transformations for the Compile-Time Optimization of Parallel Object-Oriented Scientific Applications, 2001, LCPC.
[15] Daeyeon Park, et al. Improving the effectiveness of software prefetching with adaptive executions, 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques.
[16] Donald Yeung, et al. The Efficacy of Software Prefetching and Locality Optimizations on Future Memory Systems, 2004, J. Instr. Level Parallelism.