Design and Optimization of Scientific Applications for Highly Heterogeneous and Hierarchical HPC Platforms Using Functional Computation Performance Models
Emmanuel Jeannot | Leonel Sousa | Ziming Zhong | Aleksandar Ilic | Julius Žilinskas | David Clarke | Vladimir Rychkov | Alexey Lastovetsky
[1] Frédéric Wagner,et al. Hierarchical Work-Stealing , 2010, Euro-Par.
[2] Kai Lu,et al. Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing , 2010, 2010 IEEE International Conference on Cluster Computing.
[3] Teresa H. Y. Meng,et al. Merge: a programming model for heterogeneous multi-core systems , 2008, ASPLOS.
[4] Robert A. van de Geijn,et al. Solving dense linear systems on platforms with multiple hardware accelerators , 2009, PPoPP '09.
[5] Robert D. Blumofe,et al. Scheduling multithreaded computations by work stealing , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[6] Ziming Zhong,et al. Data Partitioning on Heterogeneous Multicore Platforms , 2011, 2011 IEEE International Conference on Cluster Computing.
[7] Alexey L. Lastovetsky,et al. Dynamic Load Balancing of Parallel Computational Iterative Routines on Platforms with Memory Heterogeneity , 2010, Euro-Par Workshops.
[8] Antonio J. Plaza,et al. Automatic tuning of iterative computation on heterogeneous multiprocessors with ADITHE , 2011, The Journal of Supercomputing.
[9] Jaeyoung Choi,et al. A new parallel matrix multiplication algorithm on distributed‐memory concurrent computers , 1998.
[10] Leonel Sousa,et al. On Realistic Divisible Load Scheduling in Highly Heterogeneous Distributed Systems , 2012, 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[11] Jaeyoung Choi,et al. Pumma: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers , 1994, Concurr. Pract. Exp..
[12] Jeanette P. Schmidt,et al. Load-sharing in heterogeneous systems via weighted factoring , 1996, SPAA '96.
[13] Jack J. Dongarra,et al. Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems , 2012, ICS '12.
[14] Francisco Almeida,et al. Dynamic Load Balancing on Dedicated Heterogeneous Systems , 2008.
[15] Thomas Hérault,et al. Hierarchical QR Factorization Algorithms for Multi-core Cluster Systems , 2011, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[16] Leonel Sousa,et al. Collaborative execution environment for heterogeneous parallel systems , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[17] Satoshi Matsuoka,et al. An efficient, model-based CPU-GPU heterogeneous FFT library , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[18] Mohammed J. Zaki,et al. Compile-Time Scheduling Algorithms for a Heterogeneous Network of Workstations , 1997, Comput. J..
[19] Leonel Sousa,et al. Scheduling Divisible Loads on Heterogeneous Desktop Systems with Limited Memory , 2011, Euro-Par Workshops.
[20] Cédric Augonnet,et al. Automatic Calibration of Performance Models on Heterogeneous Multicore Architectures , 2009, Euro-Par Workshops.
[21] Leonel Sousa,et al. Hierarchical Partitioning Algorithm for Scientific Computing on Highly Heterogeneous CPU + GPU Clusters , 2012, Euro-Par.
[22] Kim M. Hazelwood,et al. Where is the data? Why you cannot debate CPU vs. GPU performance without the answer , 2011, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[23] Yves Robert,et al. Mapping and load-balancing iterative computations , 2004, IEEE Transactions on Parallel and Distributed Systems.
[24] Yves Robert,et al. Matrix Multiplication on Heterogeneous Platforms , 2001, IEEE Trans. Parallel Distributed Syst..
[25] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[26] Hyesoon Kim,et al. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[27] Alexey L. Lastovetsky,et al. Column-Based Matrix Partitioning for Parallel Matrix Multiplication on Heterogeneous Processors Based on Functional Performance Models , 2011, Euro-Par Workshops.
[28] Alexey L. Lastovetsky,et al. Heterogeneous Distribution of Computations Solving Linear Algebra Problems on Networks of Heterogeneous Computers , 2001, J. Parallel Distributed Comput..
[29] Jaeyoung Choi. A new parallel matrix multiplication algorithm on distributed-memory concurrent computers , 1998, Concurr. Pract. Exp..
[30] Alexey L. Lastovetsky,et al. Data Partitioning with a Functional Performance Model of Heterogeneous Processors , 2007, Int. J. High Perform. Comput. Appl..
[31] Leonel Sousa,et al. Simultaneous Multi-Level Divisible Load Balancing for Heterogeneous Desktop Systems , 2012, 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications.
[32] Ziming Zhong,et al. Data Partitioning on Heterogeneous Multicore and Multi-GPU Systems Using Functional Performance Models of Data-Parallel Applications , 2012, 2012 IEEE International Conference on Cluster Computing.
[33] Francisco Almeida,et al. Dynamic Load Balancing on Dedicated Heterogeneous Systems , 2008, PVM/MPI.
[34] Massimiliano Fatica. Accelerating linpack with CUDA on heterogenous clusters , 2009, GPGPU-2.