Paper information

An Overview of Topology Mapping Algorithms and Techniques in High‐Performance Computing

Torsten Hoefler | Emmanuel Jeannot | Guillaume Mercier | Julius Žilinskas

[1] Toshiyuki Shimizu, et al. Tofu: A 6D Mesh/Torus Interconnect for Exascale Computers, 2009, Computer.

[2] Franck Cappello, et al. The International Exascale Software Project: a Call To Cooperative Action By the Global High-Performance Community, 2009, Int. J. High Perform. Comput. Appl.

[3] Emmanuel Jeannot, et al. Near-Optimal Placement of MPI Processes on Hierarchical NUMA Architectures, 2010, Euro-Par.

[4] Laxmikant V. Kalé, et al. Dynamic topology aware load balancing algorithms for molecular dynamics applications, 2009, ICS.

[5] Larry Kaplan, et al. The Gemini System Interconnect, 2010, 2010 18th IEEE Symposium on High Performance Interconnects.

[6] Courtenay T. Vaughan, et al. Zoltan data management services for parallel dynamic applications, 2002, Comput. Sci. Eng.

[7] Charles E. Leiserson, et al. Fat-trees: Universal networks for hardware-efficient supercomputing, 1985, IEEE Transactions on Computers.

[8] Torsten Hoefler, et al. The PERCS High-Performance Interconnect, 2010, 2010 18th IEEE Symposium on High Performance Interconnects.

[9] Chao Yang, et al. Topology-Aware Mappings for Large-Scale Eigenvalue Problems, 2012, Euro-Par.

[10] William J. Dally, et al. Cost-Efficient Dragonfly Topology for Large-Scale Systems, 2009, IEEE Micro.

[11] Jake K. Aggarwal, et al. A Mapping Strategy for Parallel Processing, 1987, IEEE Transactions on Computers.

[12] Philip Heidelberger, et al. Blue Gene/L torus interconnection network, 2005, IBM J. Res. Dev.

[13] Scott F. Midkiff, et al. Heuristic Technique for Processor and Link Assignment in Multicomputers, 1991, IEEE Trans. Computers.

[14] Jeffrey M. Squyres, et al. Locality-Aware Parallel Process Mapping for Multi-core HPC Systems, 2011, 2011 IEEE International Conference on Cluster Computing.

[15] Mohan Kumar, et al. On generalized fat trees, 1995, Proceedings of 9th International Parallel Processing Symposium.

[16] E. Cuthill, et al. Reducing the bandwidth of sparse symmetric matrices, 1969, ACM '69.

[17] Brian W. Kernighan, et al. An efficient heuristic procedure for partitioning graphs, 1970, Bell Syst. Tech. J.

[18] Takao Hatazaki, et al. Rank Reordering Strategy for MPI Topology Creation Functions, 1998, PVM/MPI.

[19] Hubert Ritzdorf, et al. The scalable process topology interface of MPI 2.2, 2011, Concurr. Comput. Pract. Exp.

[20] Laxmikant V. Kalé, et al. CHARM++: a portable concurrent object oriented system based on C++, 1993, OOPSLA '93.

[21] Guillaume Mercier, et al. Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments, 2009, PVM/MPI.

[22] Arnold L. Rosenberg, et al. Issues in the Study of Graph Embeddings, 1980, WG.

[23] Minna Palmroth, et al. Topology Aware Process Mapping, 2012, PARA.

[24] Timothy Roscoe, et al. VF2x: Fast, Efficient Virtual Network Mapping for Real Testbed Workloads, 2012, TRIDENTCOM.

[25] Laxmikant V. Kalé, et al. Benefits of Topology Aware Mapping for Mesh Interconnects, 2008, Parallel Process. Lett.

[26] Kenji Ono, et al. Automatically optimized core mapping to subdomains of domain decomposition method on multicore parallel environments, 2013.

[27] Shang-Hua Teng, et al. How Good is Recursive Bisection?, 1997, SIAM J. Sci. Comput.

[28] Shahid H. Bokhari, et al. On the Mapping Problem, 1981, IEEE Transactions on Computers.

[29] B. Brandfass, et al. Rank reordering for MPI communication optimization, 2013.

[30] Fabrizio Petrini, et al. k-ary n-trees: high performance networks for massively parallel architectures, 1997, Proceedings 11th International Parallel Processing Symposium.