Design and implementation of control sequence generator for SDN-enhanced MPI

MPI (Message Passing Interface) offers a suite of APIs for inter-process communication among parallel processes. We have been working on accelerating MPI collective communications, such as MPI_Bcast and MPI_Allreduce, by taking advantage of the network programmability brought by Software Defined Networking (SDN). The basic idea is to allow an SDN controller to dynamically control the packet flows generated by MPI collective communication, based on the communication pattern and the underlying network conditions. Although our research has succeeded in accelerating individual MPI collective communications in terms of execution time, switching the network control functionality for MPI collective communication as an MPI program executes has not yet been considered. This paper presents a mechanism that provides a control sequence with which the SDN controller controls packet flows according to the communication plan of the entire MPI application. The control sequence contains a chronologically ordered list of the MPI collectives invoked in the MPI application, together with process-related information for each entry in the list. To verify that SDN-enhanced MPI collectives can be used in combination with the proposed mechanism, we prototyped the envisioned environment and confirmed that SDN-enhanced MPI collectives can indeed be used in combination.
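The abstract does not specify the concrete format of the control sequence. As a minimal illustrative sketch, assuming a JSON-like encoding and hypothetical field names (order, collective, root, ranks, hosts are not the authors' actual schema), a chronologically ordered list of collectives with per-collective process information could look like the following:

```python
# Hypothetical sketch of a control sequence for an SDN-enhanced MPI run.
# Field names and the JSON encoding are illustrative assumptions only,
# not the format described in the paper.

import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class CollectiveEntry:
    """One chronologically ordered entry: a collective plus its process info."""
    order: int                    # position in program order
    collective: str               # e.g. "MPI_Bcast", "MPI_Allreduce"
    root: Optional[int]           # root rank for rooted collectives, else None
    ranks: List[int] = field(default_factory=list)   # participating MPI ranks
    hosts: List[str] = field(default_factory=list)   # host running each rank


# Toy control sequence for an application that broadcasts, then all-reduces.
control_sequence = [
    CollectiveEntry(order=0, collective="MPI_Bcast", root=0,
                    ranks=[0, 1, 2, 3],
                    hosts=["node0", "node0", "node1", "node1"]),
    CollectiveEntry(order=1, collective="MPI_Allreduce", root=None,
                    ranks=[0, 1, 2, 3],
                    hosts=["node0", "node0", "node1", "node1"]),
]

# The generator would hand this sequence (e.g. as JSON) to the SDN controller,
# which can then install the flow rules for each collective before it starts.
print(json.dumps([asdict(entry) for entry in control_sequence], indent=2))
```

Under this reading, the SDN controller walks the list in order and switches its flow-control behaviour per collective, which matches the paper's goal of coordinating network control with the execution of the whole MPI application.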
