Tradeoffs between synchronization, communication, and computation in parallel linear algebra computations
[1] F. P. Preparata, et al. Processor-Time Tradeoffs under Bounded-Speed Message Propagation: Part I, Upper Bounds, 1995, Theory of Computing Systems.
[2] Alexander Tiskin. Communication-efficient parallel generic pairwise elimination, 2007, Future Gener. Comput. Syst.
[3] H. T. Kung, et al. I/O complexity: The red-blue pebble game, 1981, STOC '81.
[4] Alok Aggarwal, et al. Communication Complexity of PRAMs, 1990, Theor. Comput. Sci.
[5] Jack Dongarra, et al. ScaLAPACK user's guide, 1997.
[6] V. Strassen. Gaussian elimination is not optimal, 1969.
[7] Michael T. Heath, et al. Parallel solution of triangular systems on distributed-memory multiprocessors, 1988.
[8] James Demmel, et al. Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms, 2011, Euro-Par.
[9] Dror Irony, et al. Trading Replication for Communication in Parallel Distributed-Memory Dense Solvers, 2002.
[10] Ramesh Subramonian, et al. LogP: towards a realistic model of parallel computation, 1993, PPOPP '93.
[11] Alexander Tiskin, et al. All-Pairs Shortest Paths Computation in the BSP Model, 2001, ICALP.
[12] Evripidis Bampis, et al. Optimal Schedules for d-D Grid Graphs with Communication Delays, 1998, Parallel Comput.
[13] James Demmel, et al. Brief announcement: strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds, 2012, SPAA '12.
[14] Dror Irony, et al. Communication lower bounds for distributed-memory matrix multiplication, 2004, J. Parallel Distributed Comput.
[15] Leslie G. Valiant, et al. A bridging model for parallel computation, 1990, CACM.
[16] Michael A. Bender, et al. Optimal Sparse Matrix Dense Vector Multiplication in the I/O-Model, 2007, SPAA '07.
[17] James Demmel, et al. Improving communication performance in dense linear algebra via topology aware collectives, 2011, International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[18] Alexander Tiskin, et al. Memory-Efficient Matrix Multiplication in the BSP Model, 1999, Algorithmica.
[19] James Demmel, et al. Minimizing Communication in Numerical Linear Algebra, 2009, SIAM J. Matrix Anal. Appl.
[20] Danny C. Sorensen, et al. Analysis of Pairwise Pivoting in Gaussian Elimination, 1985, IEEE Transactions on Computers.
[21] Christos H. Papadimitriou, et al. A Communication-Time Tradeoff, 1987, SIAM J. Comput.
[22] Optimal Schedules for d-D Grid Graphs with Communication Delays (Extended Abstract), 1996, STACS.
[23] H. Whitney, et al. An inequality related to the isoperimetric inequality, 1949.
[24] Robert A. van de Geijn, et al. Elemental: A New Framework for Distributed Memory Dense Matrix Computations, 2013, TOMS.
[25] Stephen Warshall, et al. A Theorem on Boolean Matrices, 1962, JACM.