Unified on-chip memory allocation for SIMT architecture
[1] Gregory J. Chaitin, et al. Register allocation & spilling via graph coloring, 1982, SIGPLAN '82.
[2] Vivek Sarkar, et al. Linear scan register allocation, 1999, TOPL.
[3] John Cocke, et al. Register Allocation Via Coloring, 1981, Comput. Lang.
[4] William J. Dally, et al. Unifying Primary Cache, Scratch, and Register File Memories in a Throughput Processor, 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[5] John Cocke, et al. A methodology for the real world, 1981.
[6] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[7] Gagan Agrawal, et al. An integer programming framework for optimizing shared memory use on GPUs, 2010, 2010 International Conference on High Performance Computing.
[8] Paola Batistoni, et al. International Conference, 2001.
[9] Karthikeyan Sankaralingam, et al. iGPU: Exception support and speculative execution on GPUs, 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[10] Andrew W. Appel, et al. Optimal spilling for CISC machines with few registers, 2001, PLDI '01.
[11] Ivan D. Baev. Techniques for Region-Based Register Allocation, 2009, 2009 International Symposium on Code Generation and Optimization.
[12] Rajeev Barua, et al. Recursive function data allocation to scratch-pad memory, 2007, CASES '07.
[13] G. Edward Suh, et al. SRAM-DRAM hybrid memory with applications to efficient register files in fine-grained multi-threading, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[14] Fernando Magno Quintão Pereira, et al. Spill Code Placement for SIMD Machines, 2012, SBLP.
[15] Thomas R. Gross, et al. Call-cost directed register allocation, 1997, PLDI '97.
[16] Jens Palsberg, et al. Register Allocation via Coloring of Chordal Graphs, 2005, APLAS.
[17] Fred C. Chow. Minimizing register usage penalty at procedure calls, 1988, PLDI '88.
[18] Hwansoo Han, et al. Optimal register reassignment for register stack overflow minimization, 2006, TACO.
[19] Ken Kennedy, et al. Vector Register Allocation, 1992, IEEE Trans. Computers.
[20] Jian Wang, et al. Software pipelining with register allocation and spilling, 1994, MICRO 27.
[21] Yi Yang, et al. Shared memory multiplexing: A novel way to improve GPGPU throughput, 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[22] Richard W. Vuduc, et al. A performance analysis framework for identifying potential benefits in GPGPU applications, 2012, PPoPP '12.
[23] William J. Dally, et al. A Hierarchical Thread Scheduler and Register File for Energy-Efficient Throughput Processors, 2012, TOCS.
[24] Frances E. Allen, et al. Proceedings of the 1982 SIGPLAN symposium on Compiler construction, 1982.
[25] Rajeev Barua, et al. Dynamic allocation for scratch-pad memory using compile-time decisions, 2006, TECS.
[26] Josep Llosa, et al. Hypernode reduction modulo scheduling, 1995, MICRO 28.
[27] Sebastian Hack, et al. Register allocation for programs in SSA form, 2006, CC.
[28] Joseph S. Sventek, et al. Efficient dynamic heap allocation of scratch-pad memory, 2008, ISMM '08.