TAMM: A New Topology-Aware Mapping Method for Parallel Applications on the Tianhe-2A Supercomputer
[1] Emmanuel Jeannot,et al. Topology-aware job mapping , 2018, Int. J. High Perform. Comput. Appl..
[2] B. Brandfass,et al. Rank reordering for MPI communication optimization , 2013 .
[3] Laxmikant V. Kale,et al. Automating topology aware mapping for supercomputers , 2010 .
[4] David H. Bailey,et al. The NAS parallel benchmarks summary and preliminary results , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[5] Torsten Hoefler,et al. Generic topology mapping strategies for large-scale parallel architectures , 2011, ICS '11.
[6] Bruce Hendrickson,et al. The Chaco user's guide. Version 1.0 , 1993 .
[7] Torsten Hoefler,et al. An Overview of Topology Mapping Algorithms and Techniques in High‐Performance Computing , 2014, HiPC 2014.
[8] Jean Roman,et al. SCOTCH: A Software Package for Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs , 1996, HPCN Europe.
[9] Torsten Hoefler,et al. NUMA-aware shared-memory collective communication for MPI , 2013, HPDC.
[10] Guillaume Mercier,et al. Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments , 2009, PVM/MPI.
[11] Iain S. Duff. European Exascale Software Initiative: Numerical Libraries, Solvers and Algorithms , 2011, Euro-Par Workshops.
[12] Emmanuel Jeannot,et al. Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques , 2014, IEEE Transactions on Parallel and Distributed Systems.
[13] Ahmad Afsahi,et al. PTRAM: A Parallel Topology- and Routing-Aware Mapping Framework for Large-Scale HPC Systems , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[14] Laxmikant V. Kalé,et al. Automatic topology mapping of diverse large-scale parallel applications , 2017, ICS '17.
[15] Vitus J. Leung,et al. PaCMap: Topology Mapping of Unstructured Communication Patterns onto Non-contiguous Allocations , 2015, ICS.
[16] Chris Walshaw,et al. JOSTLE: multilevel graph partitioning software: an overview , 2007 .
[17] Dhabaleswar K. Panda,et al. Design of network topology aware scheduling services for large InfiniBand clusters , 2013, 2013 IEEE International Conference on Cluster Computing (CLUSTER).
[18] Philippe Olivier Alexandre Navaux,et al. Multi-core aware process mapping and its impact on communication overhead of parallel applications , 2009, 2009 IEEE Symposium on Computers and Communications.
[19] Xiangke Liao,et al. High Performance Interconnect Network for Tianhe System , 2015, Journal of Computer Science and Technology.
[20] Laxmikant V. Kalé,et al. An evaluative study on the effect of contention on message latencies in large supercomputers , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[21] José E. Moreira,et al. Topology Mapping for Blue Gene/L Supercomputer , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[22] Emmanuel Jeannot,et al. Near-Optimal Placement of MPI Processes on Hierarchical NUMA Architectures , 2010, Euro-Par.
[23] Canqun Yang,et al. MilkyWay-2 supercomputer: system and application , 2014, Frontiers of Computer Science.
[24] P. Sadayappan,et al. Task allocation onto a hypercube by recursive mincut bipartitioning , 1988, C3P.
[25] Wenguang Chen,et al. MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters , 2006, ICS '06.
[26] Patrick H. Worley,et al. Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers , 2016, ICPE.
[27] Al Geist,et al. IESP Exascale Challenge: Co-Design of Architectures and Algorithms , 2009, Int. J. High Perform. Comput. Appl..
[28] Jia Wang,et al. Topology mapping of irregular parallel applications on torus-connected supercomputers , 2017, The Journal of Supercomputing.
[29] Yi Zheng,et al. The TH Express high performance interconnect networks , 2014, Frontiers of Computer Science.
[30] David H. Bailey,et al. The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..
[31] Kefei Wang,et al. The Efficient In-band Management for Interconnect Network in Tianhe-2 System , 2016, 2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP).
[32] Ahmad Afsahi,et al. Topology-Aware Rank Reordering for MPI Collectives , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[33] Kwan-Liu Ma,et al. A Visual Analytics System for Optimizing Communications in Massively Parallel Applications , 2017, 2017 IEEE Conference on Visual Analytics Science and Technology (VAST).