Network Optimization for High Performance Cloud Computing

Once thought of as a technology restricted primarily to the scientific community, high-performance computing (HPC) is now established as an important value-creation tool for enterprises. Enterprise HPC is driven predominantly by the need for high-performance data analytics (HPDA) and large-scale machine learning, both instrumental to business growth in today’s competitive markets. Cloud computing, characterized by on-demand network access to computational resources, has great potential to bring HPC capabilities to a broader audience. Clouds built on traditional lossy network technologies, however, have by and large not proved sufficient for HPC applications. Both traditional HPC workloads and HPDA require high predictability, large bandwidth, and low latency, a combination that best-effort cloud networks do not readily provide. On the other hand, the lossless interconnection networks commonly deployed in HPC systems lack the flexibility needed for dynamic cloud environments. In this thesis, we identify and address research challenges that hinder the realization of an efficient HPC cloud computing platform, using the InfiniBand interconnect as a demonstration technology. In particular, we address challenges related to efficient routing, load balancing, low-overhead virtualization, performance isolation, and fast network reconfiguration, all to improve the utilization and flexibility of the underlying interconnect of an HPC cloud. In addition, we provide a framework for realizing a self-adaptive network architecture for HPC clouds, offering dynamic and autonomic adaptation of the underlying interconnect according to varying traffic patterns, resource availability, workload distribution, and service-provider-defined policies.
The work presented in this thesis helps bridge the performance gap between cloud and traditional HPC infrastructures; it provides practical solutions that enable an efficient, flexible, multi-tenant HPC network suitable for high-performance cloud computing.
