Exploiting Performance Portability in Search Algorithms for Autotuning