Low-latency software defined network for high performance clouds

Multi-tenant clouds with resource virtualization offer elastic resources and eliminate the initial cost and time of setting up a cluster for applications. However, poor network performance, performance variation, and noisy neighbors remain challenges for running high performance applications on public clouds. Scientific applications have complex communication patterns, so running them on these virtualized resources requires low-latency communication mechanisms and a rich set of communication constructs. To minimize the virtualization overhead, a novel approach to low-latency networking for HPC clouds is proposed and implemented over a multi-technology software defined network. The efficiency of the proposed low-latency software defined network is analyzed and evaluated for high performance applications. The experimental results show that the latest Mellanox FDR InfiniBand interconnect combined with the Mellanox OpenStack plugin gives the best performance for implementing VM-based high performance clouds with large message sizes.
