Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Jens Domke | David K. Lowenthal | Jayaraman J. Thiagarajan | Abhinav Bhatele | Nikhil Jain | Staci A. Smith | Clara E. Cromey
[1] Leslie G. Valiant, et al. A bridging model for parallel computation, 1990, CACM.
[2] William J. Dally, et al. Technology-Driven, Highly-Scalable Dragonfly Topology, 2008, 2008 International Symposium on Computer Architecture.
[3] A. B. Langdon, et al. Filamentation and forward Brillouin scatter of entire smoothed and aberrated laser beams, 2000.
[4] Nicholas J. Wright, et al. Understanding Performance Variability on the Aries Dragonfly Network, 2017, 2017 IEEE International Conference on Cluster Computing (CLUSTER).
[5] Bronis R. de Supinski, et al. The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems, 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[6] Kevin Harms, et al. Run-to-run Variability on Xeon Phi based Cray XC Systems, 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[7] Robert B. Ross, et al. Watch Out for the Bully! Job Interference Study on Dragonfly Network, 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Michael E. Papka, et al. ALCF MPI Benchmarks: Understanding Machine-Specific Communication Behavior, 2012, 2012 41st International Conference on Parallel Processing Workshops.
[9] Laxmikant V. Kalé, et al. Predicting application performance using supervised learning on communication features, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[10] Jesús Labarta, et al. Impact of Inter-application Contention in Current and Future HPC Systems, 2010, 2010 18th IEEE Symposium on High Performance Interconnects.
[11] Laxmikant V. Kalé, et al. Identifying the Culprits Behind Network Congestion, 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[12] Nan Jiang, et al. Network endpoint congestion control for fine-grained communication, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[13] Nan Jiang, et al. Network congestion avoidance through Speculative Reservation, 2012, IEEE International Symposium on High-Performance Computer Architecture.
[14] Eitan Zahavi. D-Mod-K Routing Providing Non-Blocking Traffic for Shift Permutations on Real Life Fat Trees, 2010.
[15] Torsten Hoefler, et al. Multistage switches are not crossbars: Effects of static routing in high-performance networks, 2008, 2008 IEEE International Conference on Cluster Computing.
[16] Pedro López, et al. Deterministic versus Adaptive Routing in Fat-Trees, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[17] Zhou Tong, et al. A comparative study of SDN and adaptive routing on dragonfly networks, 2017, SC.
[18] Katherine E. Isaacs, et al. There goes the neighborhood: Performance degradation due to nearby jobs, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[19] Enhancing InfiniBand with OpenFlow-Style SDN Capability, 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[20] Wu-chun Feng, et al. Improved Resource Utilization with Buffered Coscheduling, 2001, Parallel Algorithms Appl.
[21] François Gygi, et al. Architecture of Qbox: A scalable first-principles molecular dynamics code, 2008, IBM J. Res. Dev.
[22] Torsten Hoefler, et al. Scheduling-Aware Routing for Supercomputers, 2016.
[23] C. DeTar, et al. Scaling tests of the improved Kogut-Susskind quark action, 1999, hep-lat/9912018.
[24] A. Gentile, et al. Network Performance Counter Monitoring and Analysis on the Cray XC Platform, 2016.
[25] Sangeetha Abdu Jyothi, et al. Measuring and Understanding Throughput of Network Topologies, 2014, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[26] Charles E. Leiserson, et al. Fat-trees: Universal networks for hardware-efficient supercomputing, 1985, IEEE Transactions on Computers.
[27] Mike Higgins, et al. Cray Cascade: A scalable HPC system based on a Dragonfly network, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.