GPU performance prediction using parametrized models
[1] Flemming Nielson,et al. Principles of Program Analysis, 1999, Springer Berlin Heidelberg.
[2] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[3] Subhash Saini,et al. Performance prediction and its use in parallel and distributed computing systems , 2006, Future Gener. Comput. Syst.
[4] Rudolf Eigenmann,et al. Fast and effective orchestration of compiler optimizations for automatic performance tuning , 2006, International Symposium on Code Generation and Optimization (CGO'06).
[5] Laszlo A. Belady,et al. A Study of Replacement Algorithms for a Virtual-Storage Computer , 1966, IBM Syst. J.
[6] R. C. Whaley,et al. Timing high performance kernels through empirical compilation , 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[7] H. Rice. Classes of recursively enumerable sets and their decision problems , 1953.
[8] Stephen McCamant,et al. The Daikon system for dynamic detection of likely invariants , 2007, Sci. Comput. Program..
[9] Hyesoon Kim,et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness , 2009, ISCA '09.
[10] Eric A. Brewer,et al. PROTEUS: a high-performance parallel-architecture simulator , 1992, SIGMETRICS '92/PERFORMANCE '92.
[11] William Gropp,et al. An adaptive performance modeling tool for GPU architectures , 2010, PPoPP '10.
[12] John M. Mellor-Crummey,et al. Cross-architecture performance predictions for scientific applications using parameterized models , 2004, SIGMETRICS '04/Performance '04.
[13] Sally A. McKee,et al. An Approach to Performance Prediction for Parallel Applications , 2005, Euro-Par.
[14] Nicholas Nethercote,et al. Valgrind: a framework for heavyweight dynamic binary instrumentation , 2007, PLDI '07.
[15] M. J. Quinn,et al. Analytical performance prediction on multicomputers , 1993, Supercomputing '93.
[16] D. V. Sidorov,et al. The use of dynamic analysis for generation of input data that demonstrates critical bugs and vulnerabilities in programs , 2010, Programming and Computer Software.
[17] Carl Staelin,et al. lmbench: Portable Tools for Performance Analysis , 1996, USENIX Annual Technical Conference.
[18] Sudhakar Yalamanchili,et al. A characterization and analysis of PTX kernels , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[19] David Parello,et al. Barra, a Modular Functional GPU Simulator for GPGPU , 2009.
[20] Sudhakar Yalamanchili,et al. Modeling GPU-CPU workloads and systems , 2010, GPGPU-3.
[21] David A. Padua,et al. The Power of Belady's Algorithm in Register Allocation for Long Basic Blocks , 2003, LCPC.
[22] Yao Zhang,et al. A quantitative performance analysis model for GPU architectures , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[23] Sartaj Sahni,et al. Performance metrics: keeping the focus on runtime , 1996, IEEE Parallel Distributed Technol. Syst. Appl..
[24] Keith D. Cooper,et al. An Experimental Evaluation of List Scheduling , 1998.
[25] K. Srinathan,et al. A performance prediction model for the CUDA GPGPU platform , 2009, 2009 International Conference on High Performance Computing (HiPC).
[26] Mark N. Wegman,et al. Efficiently computing static single assignment form and the control dependence graph , 1991, TOPLAS.