Workload Partitioning for Accelerating Applications on Heterogeneous Platforms
Jie Shen | Henk J. Sips | Ana Lucia Varbanescu | Yutong Lu | Peng Zou
[1] Jesús Labarta,et al. A Framework for Performance Modeling and Prediction , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[2] Teresa H. Y. Meng,et al. Merge: a programming model for heterogeneous multi-core systems , 2008, ASPLOS.
[3] Mark D. Hill,et al. Amdahl's Law in the Multicore Era , 2008, Computer.
[4] Hyesoon Kim,et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness , 2009, ISCA '09.
[5] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[6] Hyesoon Kim,et al. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[7] Richard W. Vuduc,et al. Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems , 2009, ICS.
[8] Hyesoon Kim,et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness , 2009 .
[9] Surendra Byna,et al. Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory , 2010, SPAA '10.
[10] Murat Efe Guney,et al. On the limits of GPU acceleration , 2010 .
[11] Gagan Agrawal,et al. Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations , 2010, ICS '10.
[12] Pradeep Dubey,et al. Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU , 2010, ISCA.
[13] Jack J. Dongarra,et al. Towards dense linear algebra for hybrid GPU accelerated manycore systems , 2009, Parallel Comput..
[14] Jérémie Allard,et al. Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations , 2010, Euro-Par.
[15] Jaejin Lee,et al. Performance characterization of the NAS Parallel Benchmarks in OpenCL , 2011, 2011 IEEE International Symposium on Workload Characterization (IISWC).
[16] Alejandro Duran,et al. Ompss: a Proposal for Programming Heterogeneous Multi-Core Architectures , 2011, Parallel Process. Lett..
[17] Michael F. P. O'Boyle,et al. A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL , 2011, CC.
[18] Kiran Kumar Matam,et al. Accelerating Sparse Matrix Vector Multiplication in Iterative Methods Using GPU , 2011, 2011 International Conference on Parallel Processing.
[19] Hendrikus G. Visser,et al. A framework for simulation of aircraft flyover noise through a non-standard atmosphere , 2012 .
[20] Wolfgang Karl,et al. Seamlessly portable applications: Managing the diversity of modern heterogeneous systems , 2012, TACO.
[21] Greg Stitt,et al. The RACECAR heuristic for automatic function specialization on multi-core heterogeneous systems , 2012, CASES '12.
[22] Matei Ripeanu,et al. A yoke of oxen and a thousand chickens for heavy lifting graph processing , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[23] Wei Chen,et al. GreenGPU: A Holistic Approach to Energy Efficiency in GPU-CPU Heterogeneous Architectures , 2012, 2012 41st International Conference on Parallel Processing.
[24] Jack J. Dongarra,et al. Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems , 2012, ICS '12.
[25] Jungwon Kim,et al. SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters , 2012, ICS '12.
[26] Kevin Skadron,et al. Load balancing in a changing world: dealing with heterogeneity and performance variability , 2013, CF '13.
[27] Margaret Martonosi,et al. Reducing GPU offload latency via fine-grained CPU-GPU synchronization , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[28] Thomas Fahringer,et al. An automatic input-sensitive approach for heterogeneous task partitioning , 2013, ICS '13.
[29] Matei Ripeanu,et al. On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[30] Jie Shen,et al. An application-centric evaluation of OpenCL on multi-core CPUs , 2013, Parallel Comput..
[31] Jie Shen,et al. Performance Traps in OpenCL for CPUs , 2013, 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[32] Basilio B. Fraguela,et al. Exploiting heterogeneous parallelism with the Heterogeneous Programming Library , 2013, J. Parallel Distributed Comput..
[33] Jie Shen,et al. Glinda: a framework for accelerating imbalanced applications on heterogeneous platforms , 2013, CF '13.
[34] Joseph JáJá,et al. High Performance FFT Based Poisson Solver on a CPU-GPU Heterogeneous Platform , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[35] Natalie D. Enright Jerger,et al. DistCL: A Framework for the Distributed Execution of OpenCL Kernels , 2013, 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems.
[36] Jie Shen,et al. Look before You Leap: Using the Right Hardware Resources to Accelerate Applications , 2014, 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS).
[37] Jie Shen,et al. Improving performance by matching imbalanced workloads with heterogeneous platforms , 2014, ICS '14.
[38] Cees T. A. M. de Laat,et al. An Empirical Evaluation of GPGPU Performance Models , 2014, Euro-Par Workshops.
[39] Jason Maassen,et al. Performance Models for CPU-GPU Data Transfers , 2014, 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[40] Jie Shen,et al. Matchmaking Applications and Partitioning Strategies for Efficient Execution on Heterogeneous Platforms , 2015, 2015 44th International Conference on Parallel Processing.