Analyzing power efficiency of optimization techniques and algorithm design methods for applications on heterogeneous platforms
[1] D. Kaeli,et al. Low-cost Techniques for Reducing Branch Context Pollution in a Soft Realtime Embedded Multithreaded Processor , 2007, Symposium on Computer Architecture and High Performance Computing.
[2] William J. Dally,et al. The GPU Computing Era , 2010, IEEE Micro.
[3] George Varghese,et al. A 22nm IA multi-CPU and GPU System-on-Chip , 2012, 2012 IEEE International Solid-State Circuits Conference.
[4] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[5] F. Al-Shamali,et al. Author Biographies. , 2015, Journal of social work in disability & rehabilitation.
[6] Toshio Endo,et al. Bandwidth intensive 3-D FFT kernel for GPUs using CUDA , 2008, HiPC 2008.
[7] David R. Kaeli,et al. Multi2Sim: A simulation framework for CPU-GPU computing , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[8] David R. Kaeli,et al. Exploiting Memory Access Patterns to Improve Memory Performance in Data-Parallel Architectures , 2011, IEEE Transactions on Parallel and Distributed Systems.
[9] Reiji Suda,et al. Accurate Measurements and Precise Modeling of Power Dissipation of CUDA Kernels toward Power Optimized High Performance CPU-GPU Computing , 2009, 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies.
[10] Wayne Luk,et al. Power profiling and optimization for heterogeneous multi-core systems , 2011, CARN.
[11] David R. Kaeli,et al. Exploring Novel Parallelization Technologies for 3-D Imaging Applications , 2007, 19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07).
[12] Majid Sarrafzadeh,et al. Energy-aware high performance computing with graphic processing units , 2008, CLUSTER 2008.
[13] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[14] David R. Kaeli,et al. Quantifying the energy efficiency of FFT on heterogeneous platforms , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[15] Xipeng Shen,et al. A cross-input adaptive framework for GPU program optimizations , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[16] Mahmut T. Kandemir,et al. The design and use of simplePower: a cycle-accurate energy estimation tool , 2000, Proceedings 37th Design Automation Conference.
[17] Jean-Yves Blanc,et al. Imaging Earth's Subsurface Using CUDA , 2007 .
[18] Wen-mei W. Hwu,et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA , 2008, PPoPP.
[19] Satoshi Matsuoka,et al. Statistical power modeling of GPU kernels using performance counters , 2010, International Conference on Green Computing.
[20] Martin Vetterli,et al. Fast Fourier transforms: a tutorial review and a state of the art , 1990 .
[21] Piotr Indyk,et al. Faster GPS via the sparse Fourier transform , 2012, Mobicom '12.
[22] David R. Kaeli,et al. Architecture-aware optimization targeting multithreaded stream computing , 2009, GPGPU-2.
[23] Donggang Liu,et al. Combating side-channel attacks using key management , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[24] Steven G. Johnson,et al. The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.
[25] Collin McCurdy,et al. The Scalable Heterogeneous Computing (SHOC) benchmark suite , 2010, GPGPU-3.
[26] Dong Li,et al. The tradeoffs of fused memory hierarchies in heterogeneous computing architectures , 2012, CF '12.
[27] Satoshi Matsuoka,et al. Bandwidth intensive 3-D FFT kernel for GPUs using CUDA , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[28] Arnaud Tisserand,et al. Power Consumption of GPUs from a Software Perspective , 2009, ICCS.
[29] Naga K. Govindaraju,et al. High performance discrete Fourier transforms on graphics processors , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[30] David Kaeli,et al. Heterogeneous Computing with OpenCL , 2011 .
[31] Hyesoon Kim,et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness , 2009, ISCA '09.
[32] V. Volkov,et al. Fitting FFT onto the G80 Architecture , 2008 .
[33] Matt Pharr,et al. GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation , 2005 .
[34] Haoran Yi,et al. How GPUs Can Improve the Quality of Magnetic Resonance Imaging , 2011 .
[35] John D. Owens,et al. GPU Computing , 2008, Proceedings of the IEEE.
[36] G. D. Peterson,et al. Power Aware Computing on GPUs , 2012, 2012 Symposium on Application Accelerators in High Performance Computing.