Profiling Halide DSL with CPU Performance Events for Schedule Optimization
[1] Uday Bondhugula, et al. PolyMage: Automatic Optimization for Image Processing Pipelines, 2015, ASPLOS.
[2] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI 2013.
[3] Gerhard Wellein, et al. LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments, 2010, 2010 39th International Conference on Parallel Processing Workshops.
[4] Michael Goesele, et al. Guided profiling for auto-tuning array layouts on GPUs, 2015, PMBS '15.
[5] Jonathan Ragan-Kelley, et al. Automatically scheduling Halide image processing pipelines, 2016, ACM Trans. Graph.
[6] Frédo Durand, et al. Decoupling algorithms from schedules for easy optimization of image processing pipelines, 2012, ACM Trans. Graph.
[7] Jack J. Dongarra, et al. Collecting Performance Data with PAPI-C, 2009, Parallel Tools Workshop.
[8] Michael Goesele, et al. Adaptive GPU Array Layout Auto-Tuning, 2016, SEM4HPC@HPDC.
[9] Shirley Moore, et al. Measuring Energy and Power with PAPI, 2012, 2012 41st International Conference on Parallel Processing Workshops.
[10] Nuno Roma, et al. Multi-kernel Auto-Tuning on GPUs: Performance and Energy-Aware Optimization, 2015, 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[11] Gerhard Wellein, et al. LIKWID: Lightweight Performance Tools, 2011, CHPC.
[12] Jack J. Dongarra, et al. A Portable Programming Interface for Performance Evaluation on Modern Processors, 2000, Int. J. High Perform. Comput. Appl.