Automated Timer Generation for Empirical Tuning
[1] Franz Franchetti, et al. SPIRAL: Code Generation for DSP Transforms, 2005, Proceedings of the IEEE.
[2] Mark Stephenson, et al. Predicting unroll factors using supervised classification, 2005, International Symposium on Code Generation and Optimization.
[3] Alan Jay Smith, et al. Measuring Cache and TLB Performance and Their Effect on Benchmark Runtimes, 1995, IEEE Trans. Computers.
[4] Jack J. Dongarra, et al. Automatically Tuned Linear Algebra Software, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[5] Ken Kennedy, et al. Automatic tuning of whole applications using direct search and a performance-based transformation system, 2006, The Journal of Supercomputing.
[6] R. Clint Whaley, et al. Achieving accurate and context-sensitive timing for code optimization, 2008, Softw. Pract. Exp.
[7] Robert J. Fowler, et al. HPCVIEW: A Tool for Top-down Analysis of Node Performance, 2002, The Journal of Supercomputing.
[8] Steven G. Johnson, et al. FFTW: an adaptive software architecture for the FFT, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[9] Katherine Yelick, et al. OSKI: A library of automatically tuned sparse matrix kernels, 2005.
[10] Jack J. Dongarra, et al. Automated empirical optimizations of software and the ATLAS project, 2001, Parallel Comput.
[11] R. C. Whaley, et al. Timing high performance kernels through empirical compilation, 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[12] Chun Chen, et al. Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy, 2005, International Symposium on Code Generation and Optimization.
[13] James Demmel, et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology, 1997, ICS '97.
[14] Richard W. Vuduc, et al. POET: Parameterized Optimizations for Empirical Tuning, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[15] Michael F. P. O'Boyle, et al. Using machine learning to focus iterative optimization, 2006, International Symposium on Code Generation and Optimization (CGO'06).
[16] Rudolf Eigenmann, et al. Fast, automatic, procedure-level performance tuning, 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[17] Gang Ren, et al. A comparison of empirical and model-driven optimization, 2003, PLDI '03.
[18] K. Yotov, et al. X-ray: a tool for automatic measurement of hardware parameters, 2005, Second International Conference on the Quantitative Evaluation of Systems (QEST'05).
[19] R. C. Whaley, et al. Automated transformation for performance-critical kernels, 2007, LCSD '07.