Are Static Schedules so Bad? A Case Study on Cholesky Factorization
Emmanuel Agullo | Olivier Beaumont | Lionel Eyraud-Dubois | Suraj Kumar
[1] Salim Hariri,et al. Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing , 2002, IEEE Trans. Parallel Distributed Syst..
[2] Jean-François Méhaut,et al. Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-core Architectures , 2014, Euro-Par.
[3] Eduard Ayguadé,et al. Hierarchical Task-Based Programming With StarSs , 2009, Int. J. High Perform. Comput. Appl..
[4] Emmanuel Agullo,et al. Bridging the Gap between Performance and Bounds of Cholesky Factorization on Heterogeneous Platforms , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.
[5] Robert A. van de Geijn,et al. SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks , 2008, PPoPP.
[6] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[7] Oliver Sinnen,et al. Scheduling task graphs optimally with A* , 2010, The Journal of Supercomputing.
[8] Henri Casanova,et al. SimGrid: A Generic Framework for Large-Scale Distributed Experiments , 2008, Tenth International Conference on Computer Modeling and Simulation (UKSim 2008).
[9] Jack Dongarra,et al. QUARK Users' Guide: QUeueing And Runtime for Kernels , 2011.
[10] Julien Langou,et al. A Critical Path Approach to Analyzing Parallelism of Algorithmic Variants. Application to Cholesky Inversion , 2010, ArXiv.
[11] Jack Dongarra,et al. Faster, Cheaper, Better - a Hybridization Methodology to Develop Linear Algebra Software for GPUs , 2010.
[12] Julien Langou,et al. A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures , 2007, Parallel Comput..
[13] Eduard Ayguadé,et al. Exploiting asynchrony from exact forward recovery for DUE in iterative solvers , 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[14] George Bosilca,et al. PaRSEC: A programming paradigm exploiting heterogeneity for enhancing scalability , 2013.
[15] George Bosilca,et al. Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project , 2010.
[16] Emmanuel Agullo,et al. Task-Based FMM for Multicore Architectures , 2014, SIAM J. Sci. Comput..
[17] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[18] Henricus Bouwmeester,et al. Tiled Algorithms for Matrix Computations on Multicore Architectures , 2013, ArXiv.
[19] Philippe Baptiste,et al. Constraint-based scheduling: applying constraint programming to scheduling problems , 2001.
[20] Robert A. van de Geijn,et al. The libflame Library for Dense Matrix Computations , 2009, Computing in Science & Engineering.
[21] Ronald L. Graham,et al. Bounds on Multiprocessing Timing Anomalies , 1969, SIAM Journal on Applied Mathematics.