Network-Oblivious Algorithms
Geppino Pucci | Andrea Pietracaprina | Gianfranco Bilardi | Michele Scquizzato | Francesco Silvestri
[1] Harry A. G. Wijshoff,et al. A quantitative comparison of parallel computation models , 1998 .
[2] Dror Irony,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004, J. Parallel Distributed Comput..
[3] Irving L. Traiger,et al. Evaluation Techniques for Storage Hierarchies , 1970, IBM Syst. J..
[4] Geppino Pucci,et al. Area-time tradeoffs for universal VLSI circuits , 2008, Theor. Comput. Sci..
[5] Geppino Pucci,et al. Area-universal circuits with constant slowdown , 1999, Proceedings 20th Anniversary Conference on Advanced Research in VLSI.
[6] Geppino Pucci,et al. Network-Oblivious Algorithms , 2007, IPDPS.
[7] F. Thomson Leighton,et al. ARRAYS AND TREES , 1992 .
[8] Charles E. Leiserson,et al. Cache-Oblivious Algorithms , 2003, CIAC.
[9] Geppino Pucci,et al. A Quantitative Measure of Portability with Application to Bandwidth-Latency Models for Parallel Computing , 1999, Euro-Par.
[10] F. Leighton,et al. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes , 1991 .
[11] Dan Suciu,et al. Journal of the ACM , 2006 .
[12] Bruce M. Maggs,et al. Communication-efficient parallel algorithms for distributed random-access machines , 1988, Algorithmica.
[13] Yossi Matias,et al. Can shared-memory model serve as a bridging model for parallel computation? , 1997, SPAA '97.
[14] Ben H. H. Juurlink,et al. A quantitative comparison of parallel computation models , 1996, SPAA '96.
[15] Gerth Stølting Brodal,et al. On the limits of cache-obliviousness , 2003, STOC '03.
[16] Friedhelm Meyer auf der Heide,et al. Truly Efficient Parallel Algorithms: 1-optimal Multisearch for an Extension of the BSP Model , 1998, Theor. Comput. Sci..
[17] Steven G. Johnson,et al. FFTW: an adaptive software architecture for the FFT , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[18] Charles E. Leiserson,et al. Fat-trees: Universal networks for hardware-efficient supercomputing , 1985, IEEE Transactions on Computers.
[19] John E. Savage,et al. Models of computation - exploring the power of computing , 1998 .
[20] F. P. Preparata,et al. Processor—Time Tradeoffs under Bounded-Speed Message Propagation: Part I, Upper Bounds , 1995, Theory of Computing Systems.
[21] Vijaya Ramachandran,et al. Cache-efficient dynamic programming algorithms for multicores , 2008, SPAA '08.
[22] Francesco Silvestri,et al. On the Limits of Cache-Oblivious Matrix Transposition , 2006, TGC.
[23] Geppino Pucci,et al. Decomposable BSP: A Bandwidth-Latency Model for Parallel and Hierarchical Computation , 2007 .
[24] Geppino Pucci,et al. Cache-oblivious simulation of parallel programs , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[25] Alok Aggarwal,et al. Hierarchical memory with block transfer , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).
[26] Yossi Matias,et al. Can shared-memory model serve as a bridging model for parallel computation? , 1997, SPAA '97.
[27] Alok Aggarwal,et al. Communication Complexity of PRAMs , 1990, Theor. Comput. Sci..
[28] Frank Thomson Leighton,et al. Tight Bounds on the Complexity of Parallel Sorting , 1985, IEEE Trans. Computers.
[29] Bowen Alpern,et al. A model for hierarchical memory , 1987, STOC.
[30] Francesco Silvestri,et al. On the limits of cache-oblivious rational permutations , 2008, Theor. Comput. Sci..
[31] Gianfranco Bilardi,et al. A Characterization of Temporal Locality and Its Portability across Memory Hierarchies , 2001, ICALP.
[32] Clyde P. Kruskal,et al. Submachine Locality in the Bulk Synchronous Setting (Extended Abstract) , 1996, Euro-Par, Vol. II.
[33] Sivan Toledo,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004 .
[34] Frank Thomson Leighton. Introduction to parallel algorithms and architectures: arrays , 1992 .
[35] Franco P. Preparata,et al. Processor—Time Tradeoffs under Bounded-Speed Message Propagation: Part II, Lower Bounds , 1999, Theory of Computing Systems.
[36] Joseph JáJá,et al. An Introduction to Parallel Algorithms , 1992 .
[37] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[38] Franco P. Preparata,et al. Horizons of Parallel Computation , 1992, J. Parallel Distributed Comput..
[39] Mithuna Thottethodi,et al. Recursive Array Layouts and Fast Matrix Multiplication , 2002, IEEE Trans. Parallel Distributed Syst..
[40] Leslie G. Valiant. A Bridging Model for Multi-core Computing , 2008, ESA.
[41] Alexander Tiskin,et al. The Bulk-Synchronous Parallel Random Access Machine , 1996, Theor. Comput. Sci..
[42] S. Sitharama Iyengar,et al. Introduction to parallel algorithms , 1998, Wiley series on parallel and distributed computing.
[43] Michael T. Goodrich,et al. Communication-Efficient Parallel Sorting , 1999, SIAM J. Comput..
[44] Alok Aggarwal,et al. The input/output complexity of sorting and related problems , 1988, CACM.
[45] Frank Thomson Leighton,et al. Tight Bounds on the Complexity of Parallel Sorting , 1984, IEEE Transactions on Computers.
[46] Volker Strumpen,et al. The Cache Complexity of Multithreaded Cache Oblivious Algorithms , 2009, SPAA '06.
[47] Volker Strumpen,et al. Cache oblivious stencil computations , 2005, ICS '05.
[48] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[49] L. R. Kerr. The Effect of Algebraic Structure on the Computational Complexity of Matrix Multiplication , 1970 .
[50] Ramesh Subramonian,et al. LogP: a practical model of parallel computation , 1996, CACM.