Tools for machine-learning-based empirical autotuning and specialization
Allen D. Malony | Scott Biersdorff | Nicholas Chaimov
[1] Jie Wang, et al. Optimizing MPI Runtime Parameter Settings by Using Machine Learning, 2009, PVM/MPI.
[2] Allen D. Malony, et al. The Tau Parallel Performance System, 2006, Int. J. High Perform. Comput. Appl.
[3] P. Sadayappan, et al. Annotation-based empirical performance tuning using Orio, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[4] Chun Chen, et al. Auto-tuning full applications: A case study, 2011, Int. J. High Perform. Comput. Appl.
[5] Chun Chen, et al. Speeding up Nek5000 with autotuning and specialization, 2010, ICS '10.
[6] Ian H. Witten, et al. The WEKA data mining software: an update, 2009, SIGKDD Explorations.
[7] Mark Stephenson, et al. Predicting unroll factors using supervised classification, 2005, International Symposium on Code Generation and Optimization.
[8] Michael F. P. O'Boyle, et al. Rapidly Selecting Good Compiler Optimizations using Performance Counters, 2007, International Symposium on Code Generation and Optimization (CGO'07).
[9] Chun Chen, et al. A Programming Language Interface to Describe Transformations and Code Generation, 2010, LCPC.
[10] Mary W. Hall, et al. CHiLL: A Framework for Composing High-Level Loop Transformations, 2007.
[11] Archana Ganapathi, et al. A case for machine learning to optimize multicore performance, 2009.
[12] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[13] Jack J. Dongarra, et al. Automatically Tuned Linear Algebra Software, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[14] Allen D. Malony, et al. An experimental approach to performance measurement of heterogeneous parallel applications using CUDA, 2010, ICS '10.
[15] Ananta Tiwari, et al. Online Adaptive Code Generation and Tuning, 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[16] Richard W. Vuduc, et al. Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization, 2009, LCPC.
[17] Ananta Tiwari, et al. End-to-End Auto-Tuning with Active Harmony, 2010.
[18] Allen D. Malony, et al. Design and implementation of a parallel performance data management framework, 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[19] P. Sadayappan, et al. Optimal loop unrolling for GPGPU programs, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[20] Jack J. Dongarra, et al. Automated empirical optimizations of software and the ATLAS project, 2001, Parallel Comput.
[21] Vahid Tabatabaee, et al. Tuning parallel applications in parallel, 2009, Parallel Comput.
[22] Allen D. Malony, et al. ParaProf: A Portable, Extensible, and Scalable Tool for Parallel Performance Profile Analysis, 2003, Euro-Par.
[23] D. Quinlan, et al. ROSE: Compiler Support for Object-Oriented Frameworks, 1999.
[24] Jack J. Dongarra, et al. A Portable Programming Interface for Performance Evaluation on Modern Processors, 2000, Int. J. High Perform. Comput. Appl.
[25] Steven G. Johnson, et al. The Design and Implementation of FFTW3, 2005, Proceedings of the IEEE.
[26] Allen D. Malony, et al. PerfExplorer: A Performance Data Mining Framework For Large-Scale Parallel Computing, 2005, ACM/IEEE SC 2005 Conference (SC'05).
[27] Bernd Mohr, et al. A Tool Framework for Static and Dynamic Analysis of Object-Oriented Software with Templates, 2000, ACM/IEEE SC 2000 Conference (SC'00).
[28] François Bodin, et al. A Machine Learning Approach to Automatic Production of Compiler Heuristics, 2002, AIMSA.