Towards green GPUs: Warp size impact analysis
暂无分享,去创建一个
[1] Sanjay J. Patel,et al. Tradeoffs in designing accelerator architectures for visual computing , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[2] Ahmad Khonsari,et al. Dynamic warp resizing: Analysis and benefits in high-performance SIMT , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).
[3] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[4] Norman P. Jouppi,et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[5] Ahmad Khonsari,et al. Warp size impact in GPUs: large or small? , 2013, GPGPU@ASPLOS.
[6] Tor M. Aamodt,et al. Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[7] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[8] Nicolas Brunie,et al. Simultaneous branch and warp interweaving for sustained GPU performance , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[9] Tor M. Aamodt,et al. Thread block compaction for efficient SIMT control flow , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[10] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[11] Andrew B. Kahng,et al. ORION 2.0: A Power-Area Simulator for Interconnection Networks , 2012, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[12] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[13] Amirali Baniasadi,et al. Performance in GPU Architectures: Potentials and Distances , 2011 .
[14] Matei Ripeanu,et al. Size Matters: Space/Time Tradeoffs to Improve GPGPU Applications Performance , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[15] Onur Mutlu,et al. Improving GPU performance via large warps and two-level warp scheduling , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Kevin Skadron,et al. Dynamic warp subdivision for integrated branch and memory divergence tolerance , 2010, ISCA.