Assessing the cost of redistribution followed by a computational kernel: Complexity and performance results
Thomas Hérault | Yves Robert | George Bosilca | Jack J. Dongarra | Julien Herrmann | Loris Marchal
[1] Z Liu,et al. Scheduling Theory and its Applications , 1997 .
[2] David S. Johnson,et al. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, San Francisco , 1979 .
[3] Richard M. Karp,et al. A n^5/2 Algorithm for Maximum Matchings in Bipartite Graphs , 1971, SWAT.
[4] Tsan-sheng Hsu,et al. Task Allocation on a Network of Processors , 2000, IEEE Trans. Computers.
[5] H. Ali,et al. Task Scheduling in Multiprocessing Systems , 1995, Computer.
[6] Thomas Hérault,et al. Determining the Optimal Redistribution for a Given Data Partition , 2014, 2014 IEEE 13th International Symposium on Parallel and Distributed Computing.
[7] Tsan-sheng Hsu,et al. Scheduling Problems in a Practical Allocation Model , 1997, J. Comb. Optim..
[8] Geoffrey C. Fox,et al. Runtime array redistribution in HPF programs , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[9] Yves Robert,et al. Scheduling Block-Cyclic Array Redistribution , 1998, IEEE Trans. Parallel Distributed Syst..
[10] Bernard Tourancheau,et al. Efficient Block Cyclic Data Redistribution , 1996, Euro-Par, Vol. I.
[11] G. Smith,et al. Numerical Solution of Partial Differential Equations: Finite Difference Methods , 1978 .
[12] Michael G. Norman,et al. Models of machines and computation for mapping in multicomputers , 1993, CSUR.
[13] Jack J. Dongarra,et al. Software Libraries for Linear Algebra Computations on High Performance Computers , 1995, SIAM Rev..
[14] R. Noyé,et al. Numerical Solutions of Partial Differential Equations , 1983 .
[15] Yves Robert,et al. A realistic model and an efficient heuristic for scheduling with heterogeneous processors , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[16] Yi Pan,et al. Improving communication scheduling for array redistribution , 2005, J. Parallel Distributed Comput..
[17] Thomas Hérault,et al. Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[18] Julien Langou,et al. A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures , 2007, Parallel Comput..
[19] Monika Richter. Scheduling And Load Balancing In Parallel And Distributed Systems , 2016 .
[20] Edward G. Coffman,et al. Scheduling File Transfers , 1985, SIAM J. Comput..
[21] Joseph Hall,et al. Algorithms for Data Migration , 2008, Algorithmica.
[22] Yoo-Ah Kim,et al. Data migration to minimize the total completion time , 2005, J. Algorithms.
[23] Shamkant B. Navathe,et al. Scheduling data redistribution in distributed databases , 1990, [1990] Proceedings. Sixth International Conference on Data Engineering.
[24] Richard M. Karp,et al. A n^5/2 Algorithm for Maximum Matchings in Bipartite Graphs , 1971, SWAT.
[25] Lionel M. Ni,et al. Processor Mapping Techniques Toward Efficient Data Redistribution , 1995, IEEE Trans. Parallel Distributed Syst..
[26] Viktor K. Prasanna,et al. Efficient collective communication in distributed heterogeneous systems , 1999, Proceedings. 19th IEEE International Conference on Distributed Computing Systems (Cat. No.99CB37003).
[27] Thomas Hérault,et al. DAGuE: A Generic Distributed DAG Engine for High Performance Computing , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[28] David S. Johnson,et al. Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .
[29] Robert A. van de Geijn,et al. Programming matrix algorithms-by-blocks for thread-level parallelism , 2009, TOMS.
[30] Guy L. Steele,et al. The High Performance Fortran Handbook , 1993 .
[31] David W. Walker,et al. Redistribution of block-cyclic data distributions using MPI , 1996, Concurr. Pract. Exp..
[32] Alexander Schrijver,et al. Combinatorial optimization. Polyhedra and efficiency. , 2003 .
[33] Michael Stonebraker,et al. SciDB DBMS Research at M.I.T , 2013, IEEE Data Eng. Bull..
[34] James Demmel,et al. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance , 1995, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[35] Jan Mayer,et al. A numerical evaluation of preprocessing and ILU-type preconditioners for the solution of unsymmetric sparse linear systems using iterative methods , 2009, TOMS.
[36] Lei Wang,et al. Runtime Performance of Parallel Array Assignment: An Empirical Study , 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.