Optimal low-latency network topologies for cluster performance enhancement
Weifeng Liu | Yuefan Deng | Meng Guo | Zhipeng Xu | Alexandre F. Ramos | Xiaolong Huang
[1] Norman P. Jouppi,et al. Readings in computer architecture , 2000 .
[2] Pedro López,et al. Towards an Efficient Fat-Tree like Topology , 2012, Euro-Par.
[3] William J. Dally,et al. Express Cubes: Improving the Performance of k-Ary n-Cube Interconnection Networks , 1989, IEEE Trans. Computers.
[4] Turki F. Al-Somani,et al. Topological Properties of Hierarchical Interconnection Networks: A Review and Comparison , 2011, J. Electr. Comput. Eng..
[5] William J. Dally,et al. Topology optimization of interconnection networks , 2006, IEEE Computer Architecture Letters.
[6] Yuefan Deng,et al. Symmetry insights for design of supercomputer network topologies: roots and weights lattices , 2012 .
[7] Peng Zhang,et al. Evaluation of Various Networks Configurated by Adding Bypass or Torus Links , 2015, IEEE Transactions on Parallel and Distributed Systems.
[8] Lali Barrière,et al. The generalized hierarchical product of graphs , 2009, Discret. Math..
[9] Yuefan Deng,et al. Symmetry-guided design of topologies for supercomputer networks , 2017, International Journal of Modern Physics C.
[10] Charles E. Leiserson,et al. Fat-trees: Universal networks for hardware-efficient supercomputing , 1985, IEEE Transactions on Computers.
[11] Mike Higgins,et al. Cray Cascade: A scalable HPC system based on a Dragonfly network , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[12] Jack J. Dongarra,et al. HPC Challenge Benchmark , 2011, Encyclopedia of Parallel Computing.
[13] William J. Dally,et al. Performance Analysis of k-Ary n-Cube Interconnection Networks , 1987, IEEE Trans. Computers.
[14] William J. Dally,et al. Technology-Driven, Highly-Scalable Dragonfly Topology , 2008, 2008 International Symposium on Computer Architecture.
[15] IBM Blue Gene Team. Overview of the IBM Blue Gene/P Project , 2008, IBM J. Res. Dev..
[16] Junming Xu. Topological Structure and Analysis of Interconnection Networks , 2002, Network Theory and Applications.
[17] Torsten Hoefler,et al. Slim Fly: A Cost Effective Low-Diameter Network Topology , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[18] M. M. Hafizur Rahman,et al. Architecture and Network-on-Chip Implementation of a New Hierarchical Interconnection Network , 2015, J. Circuits Syst. Comput..
[19] Philip Heidelberger,et al. The IBM Blue Gene/Q interconnection network and message unit , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[20] Jan Goedgebeur,et al. Generation of cubic graphs and snarks with large girth , 2017, J. Graph Theory.
[21] Daisuke Takahashi,et al. High-Performance Radix-2, 3 and 5 Parallel 1-D Complex FFT Algorithms for Distributed-Memory Parallel Computers , 2000, The Journal of Supercomputing.
[22] Ryuhei Mori,et al. Average shortest path length of graphs of diameter 3 , 2016, 2016 Tenth IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[23] Ryosuke Mizuno,et al. Constructing large-scale low-latency network from small optimal networks , 2016, 2016 Tenth IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[24] Brendan D. McKay,et al. Generation of Cubic graphs , 2011, Discret. Math. Theor. Comput. Sci..
[25] Elwood S. Buffa,et al. Graph Theory with Applications , 1977 .
[26] W. J. Langford. Statistical Methods , 1959, Nature.
[27] Kemal Efe. A Variation on the Hypercube with Lower Diameter , 1991, IEEE Trans. Computers.
[28] Jack Dongarra,et al. Introduction to the HPCChallenge Benchmark Suite , 2004 .
[29] Srinivasan Keshav,et al. Quartz , 2014, SIGCOMM.
[30] Hideharu Amano,et al. Recursive Diagonal Torus: An Interconnection Network for Massively Parallel Computers , 2001, IEEE Trans. Parallel Distributed Syst..
[31] Xiangke Liao,et al. High Performance Interconnect Network for Tianhe System , 2015, Journal of Computer Science and Technology.
[32] Rolf Rabenseifner,et al. Benchmark design for characterization of balanced high-performance architectures , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[33] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[34] C. D. Gelatt,et al. Optimization by Simulated Annealing , 1983, Science.
[35] Yuefan Deng,et al. A new record of graph enumeration enabled by parallel processing , 2019 .
[36] Yvain Thonnart,et al. An analytical method for evaluating Network-on-Chip performance , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[37] Peter Sanders,et al. Think Locally, Act Globally: Highly Balanced Graph Partitioning , 2013, SEA.
[38] Wei Ge,et al. The Sunway TaihuLight supercomputer: system and applications , 2016, Science China Information Sciences.
[39] Thomas E. Anderson,et al. F10: A Fault-Tolerant Engineered Network , 2013, NSDI.
[40] Daisuke Takahashi,et al. A Blocking Algorithm for Parallel 1-D FFT on Shared-Memory Parallel Computers , 2002, PARA.
[41] Hong Shen,et al. A Low Cost Hybrid Fat-tree Interconnection Network , 1998 .
[42] Donald D. Cowan,et al. A partial census of trivalent generalized Moore networks , 1975 .
[43] Lali Barrière,et al. The hierarchical product of graphs , 2009, Discret. Appl. Math..
[44] Shin'ichi Miura,et al. HyperX topology: first at-scale implementation and comparison to the fat-tree , 2019, SC.
[45] Toshiyuki Shimizu,et al. Tofu: A 6D Mesh/Torus Interconnect for Exascale Computers , 2009, Computer.
[46] Larry Kaplan,et al. The Gemini System Interconnect , 2010, 2010 18th IEEE Symposium on High Performance Interconnects.
[47] Steven L. Scott,et al. The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus , 1996 .
[48] Philip Heidelberger,et al. Blue Gene/L torus interconnection network , 2005, IBM J. Res. Dev..
[49] Jung-hyun Seo,et al. The hierarchical Petersen network: a new interconnection network with fixed degree , 2017, The Journal of Supercomputing.
[50] Abdel Elah Al-Ayyoub,et al. The Cross Product of Interconnection Networks , 1997, IEEE Trans. Parallel Distributed Syst..
[51] Mitsuhisa Sato,et al. A Method for Order/Degree Problem Based on Graph Symmetry and Simulated Annealing with MPI/OpenMP Parallelization , 2019, HPC Asia.
[52] F. Harary,et al. A survey of the theory of hypercube graphs , 1988 .
[53] Henri Casanova,et al. Versatile, scalable, and accurate simulation of distributed applications and platforms , 2014, J. Parallel Distributed Comput..
[54] David H. Bailey,et al. The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..
[55] David H. Bailey,et al. NAS parallel benchmark results , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.
[56] Christoph Lenzen,et al. CLEX: Yet Another Supercomputer Architecture? , 2016, ArXiv.
[57] J. A. Bondy,et al. Graph Theory with Applications , 1978 .
[58] William J. Dally,et al. Principles and Practices of Interconnection Networks , 2004 .
[59] Keith D. Underwood,et al. SeaStar Interconnect: Balanced Bandwidth for Scalable Performance , 2006, IEEE Micro.
[60] Hadrien Mélot,et al. House of Graphs: A database of interesting graphs , 2012, Discret. Appl. Math..
[61] Susumu Horiguchi,et al. Shifted Recursive Torus interconnection for high performance computing , 1997, Proceedings High Performance Computing on the Information Superhighway. HPC Asia '97.
[62] Teruaki Kitasuka,et al. A heuristic method of generating diameter 3 graphs for order/degree problem (invited paper) , 2016, 2016 Tenth IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[63] Peng Zhang,et al. Interlacing Bypass Rings to Torus Networks for More Efficient Networks , 2011, IEEE Transactions on Parallel and Distributed Systems.
[64] V. G. Cerf,et al. A lower bound on the average shortest path length in regular graphs , 1974, Networks.
[65] Brian W. Barrett,et al. Introducing the Graph 500 , 2010 .
[66] Markus Meringer,et al. Fast generation of regular graphs and construction of cages , 1999, J. Graph Theory.
[67] Dan Li,et al. Impact of Network Topology on the Performance of DML: Theoretical Analysis and Practical Factors , 2019, IEEE INFOCOM 2019 - IEEE Conference on Computer Communications.
[68] Daisuke Takahashi,et al. The HPC Challenge (HPCC) benchmark suite , 2006, SC.
[69] Ana Paula Couto da Silva,et al. Performance Prediction of Cloud-Based Big Data Applications , 2018, ICPE.
[70] Trevor Mudge,et al. Hypercube supercomputers , 1989, Proc. IEEE.
[71] Hideharu Amano,et al. Prediction router: Yet another low latency on-chip router architecture , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[72] Deron Liang,et al. Novel Hierarchical Interconnection Networks for High-Performance Multicomputer Systems , 2004, J. Inf. Sci. Eng..