Automatic Performance Tuning of Pipeline Patterns for Heterogeneous Parallel Architectures
[1] Christoph W. Kessler,et al. SkePU: a multi-backend skeleton programming library for multi-GPU systems , 2010, HLPP '10.
[2] Xipeng Shen,et al. A cross-input adaptive framework for GPU program optimizations , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[3] I-Hsin Chung,et al. Using Information from Prior Runs to Improve Automated Tuning Systems , 2004, Proceedings of the ACM/IEEE SC2004 Conference.
[4] Cédric Augonnet,et al. PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems , 2011, IEEE Micro.
[5] Jack J. Dongarra,et al. Automated empirical optimizations of software and the ATLAS project , 2001, Parallel Comput.
[6] Michael Gerndt,et al. PERISCOPE: An Online-Based Distributed Performance Analysis Tool , 2009, Parallel Tools Workshop.
[7] Anna Sikora,et al. AutoTune: A Plugin-Driven Approach to the Automatic Tuning of Parallel Applications , 2012, PARA.
[8] Salim Hariri,et al. Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing , 2002, IEEE Trans. Parallel Distributed Syst..
[9] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[10] Steven G. Johnson,et al. FFTW: an adaptive software architecture for the FFT , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[11] Siegfried Benkner,et al. Automatic Tuning of a Parallel Pattern Library for Heterogeneous Systems with Intel Xeon Phi , 2014, 2014 IEEE International Symposium on Parallel and Distributed Processing with Applications.
[12] Timothy G. Mattson,et al. Patterns for parallel programming , 2004 .
[13] Yuefan Deng,et al. New trends in high performance computing , 2001, Parallel Computing.
[14] Wen-mei W. Hwu,et al. Program optimization space pruning for a multithreaded GPU , 2008, CGO '08.
[15] Alex Zelinsky,et al. Learning OpenCV: Computer Vision with the OpenCV Library (Bradski, G.R. et al.; 2008) [On the Shelf] , 2009, IEEE Robotics & Automation Magazine.
[16] Samuel Thibault,et al. High-Level Support for Pipeline Parallelism on Many-Core Architectures , 2012, Euro-Par.
[17] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[18] Siegfried Benkner,et al. Using explicit platform descriptions to support programming of heterogeneous many-core systems , 2012, Parallel Comput.
[19] Nathan Bell,et al. Thrust: A Productivity-Oriented Library for CUDA , 2012 .
[20] Kristina Lerman,et al. Model-guided performance tuning of parameter values: A case study with molecular dynamics visualization , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[21] Siegfried Benkner,et al. HyPHI - Task Based Hybrid Execution C++ Library for the Intel Xeon Phi Coprocessor , 2013, 2013 42nd International Conference on Parallel Processing.
[22] Peter M. W. Knijnenburg,et al. Automatic selection of compiler options using non-parametric inferential statistics , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[23] David I. August,et al. Compiler optimization-space exploration , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003.