Exploring a Novel Gathering Method for Finite Element Codes on the Cell/B.E. Architecture
[1] Guillaume Houzeaux,et al. Porting to Cell/B.E. the Alya System, a High Performance Computational Mechanics Code, 2010.
[2] G. Fasshauer. Meshfree Methods, 2004.
[3] Vipin Kumar,et al. Multilevel k-way hypergraph partitioning , 1999, DAC '99.
[4] Vipin Kumar,et al. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs, 1998, SIAM J. Sci. Comput.
[5] Guohua Jin,et al. Using Space-filling Curves for Computation Reordering, 2005.
[6] Martin J. Dürst,et al. The design and analysis of spatial data structures. Applications of spatial data structures: computer graphics, image processing, and GIS, 1991.
[7] Ken Kennedy,et al. Improving memory hierarchy performance for irregular applications , 1999, ICS '99.
[8] Larry Carter,et al. Localizing non-affine array references , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[9] H. Peter Hofstee,et al. Power efficient processor architecture and the cell processor , 2005, 11th International Symposium on High-Performance Computer Architecture.
[10] Shahid H. Bokhari,et al. A Partitioning Strategy for PDEs Across Multiprocessors , 1985, ICPP.
[11] Juan J. Navarro,et al. Data prefetching and multilevel blocking for linear algebra operations , 1996, ICS '96.
[12] Shahid H. Bokhari,et al. A Partitioning Strategy for Nonuniform Problems on Multiprocessors , 1987, IEEE Transactions on Computers.
[13] Miriam Mehl,et al. A Cache-Aware Algorithm for PDEs on Hierarchical Data Structures Based on Space-Filling Curves, 2006, SIAM J. Sci. Comput.
[14] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[15] Chau-Wen Tseng,et al. A Comparison of Locality Transformations for Irregular Codes , 2000, LCR.
[16] John Robinson,et al. Introduction to the S-adaptivity method, 1997.
[17] Daniel Jiménez-González,et al. Performance Analysis of Cell Broadband Engine for High Memory Bandwidth Applications , 2007, 2007 IEEE International Symposium on Performance Analysis of Systems & Software.
[18] Michael S. Warren,et al. A parallel hashed oct-tree N-body algorithm , 1993, Supercomputing '93. Proceedings.
[19] Chau-Wen Tseng,et al. Improving Locality for Adaptive Irregular Scientific Codes , 2000, LCPC.
[20] Fabrizio Petrini,et al. Cell Multiprocessor Communication Network: Built for Speed , 2006, IEEE Micro.
[21] Scott M. Murman,et al. Applications of Space-Filling-Curves to Cartesian Methods for CFD, 2004.
[22] David R. O'Hallaron,et al. Languages, Compilers and Run-Time Systems for Scalable Computers , 1998, Springer US.
[23] Sanjay Ranka,et al. Memory hierarchy management for iterative graph structures , 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.
[24] Samuel P. Midkiff,et al. Efficient high performance collective communication for the cell blade , 2009, ICS '09.
[25] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time, 1999, PLDI '99.
[26] Toni Cortes,et al. PARAVER: A Tool to Visualize and Analyze Parallel Code, 2007.
[27] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time, 1999.
[28] Vipin Kumar,et al. Multilevel k-way Hypergraph Partitioning , 2000, VLSI Design.