The Macro-DSE for HPC Processing Unit: The Physical Constraints Perspective
Yuxing Tang | Lei Wang | Yu Deng | Xiaoqiang Ni | Qiang Dou
[1] William J. Dally, et al. The GPU Computing Era, 2010, IEEE Micro.
[2] David Blaauw, et al. Centip3De: a many-core prototype exploring 3D integration and near-threshold computing, 2013, CACM.
[3] Jung Ho Ahn, et al. The McPAT Framework for Multicore and Manycore Architectures: Simultaneously Modeling Power, Area, and Timing, 2013, TACO.
[4] Gu-Yeon Wei, et al. Quantifying sources of error in McPAT and potential impacts on architectural studies, 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[5] Tor M. Aamodt, et al. Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[6] Ankur Srivastava, et al. Unlocking the true potential of 3D CPUs with micro-fluidic cooling, 2014, 2014 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).
[7] Daniel A. Brokenshire, et al. Introduction to the Cell Broadband Engine Architecture, 2007, IBM J. Res. Dev.
[8] Mitsumasa Koyanagi, et al. Heterogeneous 3D integration - Technology enabler toward future super-chip, 2013, 2013 IEEE International Electron Devices Meeting.
[9] Jaewon Lee, et al. RpStacks: Fast and Accurate Processor Design Space Exploration Using Representative Stall-Event Stacks, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[10] Phillip B. Gibbons. Big data: Scale down, scale up, scale out, 2015, IPDPS.
[11] Denis Foley, et al. A Low-Power Integrated x86-64 and Graphics Processor for Mobile Computing Devices, 2012, IEEE J. Solid State Circuits.
[12] Chris Zhang, et al. SeaMicro SM10000-64 server: Building datacenter servers using cell phone chips, 2011, 2011 IEEE Hot Chips 23 Symposium (HCS).
[13] Ming Yang, et al. Sonic Millip3De: A massively parallel 3D-stacked accelerator for 3D ultrasound, 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[14] Yuxing Tang, et al. A Scalable and Fast Microprocessor Design Space Exploration Methodology, 2015, 2015 IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip.
[15] Balaram Sinharoy, et al. POWER4 system microarchitecture, 2002, IBM J. Res. Dev.
[16] Franz Franchetti, et al. Data reorganization in memory using 3D-stacked DRAM, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[17] Lieven Eeckhout, et al. Chip Multiprocessor Design Space Exploration through Statistical Simulation, 2009, IEEE Transactions on Computers.
[18] Karthikeyan Sankaralingam, et al. ISA Wars, 2015, ACM Trans. Comput. Syst.
[19] Mateo Valero, et al. Supercomputing with commodity CPUs: Are mobile SoCs ready for HPC?, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[20] Michael F. P. O'Boyle, et al. Microarchitectural Design Space Exploration Using an Architecture-Centric Approach, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[21] Pradeep Dubey, et al. Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU, 2010, ISCA.
[22] Nam Sung Kim, et al. GPUWattch: enabling energy optimizations in GPGPUs, 2013, ISCA.