InfiniBand is an emerging networking technology that is gaining rapid acceptance in the HPC domain. Several systems in the Top500 list currently use InfiniBand as their primary interconnect, with more planned for the near future. At the same time, the fundamental architecture of these systems is undergoing a sea change with the advent of commodity multi-core computing. As the number of processes per compute node increases, the network interface must handle more communication traffic than in older dual- or quad-processor SMP systems; the network architecture should therefore deliver scalable performance as the number of processing cores grows. ConnectX is the fourth-generation InfiniBand adapter from Mellanox Technologies, and its novel architecture enhances the scalability and performance of InfiniBand on multi-core clusters. In this paper, we carry out an in-depth performance analysis of the ConnectX architecture, comparing it with the third-generation InfiniHost III architecture on the Intel Bensley platform with dual Clovertown processors. Our analysis reveals that aggregate bandwidth for small and medium-sized messages can be increased by a factor of 10 compared to the third-generation InfiniHost III adapters. Similarly, RDMA-Write and RDMA-Read latencies for 1-byte messages can be reduced by factors of 6 and 3, respectively, even when all cores are communicating simultaneously. Evaluation with the Halo communication kernel shows a performance benefit of a factor of 2 to 5. Finally, the performance of LAMMPS, a molecular dynamics simulator, improves by 10% on the in.rhodo benchmark.
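The latency results above come from a multi-pair measurement in which every core on the node communicates at the same time. The sketch below illustrates that methodology only, not the paper's actual harness: it launches N concurrent process pairs, times small-message ping-pong exchanges in each pair, and reports the half round-trip time as a per-pair latency estimate. Local pipes stand in for the InfiniBand RDMA operations, and all pair counts, iteration counts, and function names are illustrative assumptions.

```python
# Minimal sketch of a multi-pair ping-pong latency test, in the spirit of the
# "all cores communicating simultaneously" measurement described above.
# Local pipes replace InfiniBand RDMA; all parameters are illustrative.
import time
import threading
from multiprocessing import Process, Pipe


def echo(conn, iters):
    # Peer side: bounce every received message straight back.
    for _ in range(iters):
        conn.send_bytes(conn.recv_bytes())
    conn.close()


def ping(conn, iters, results, idx, size=1):
    # Timing side: send a `size`-byte message and wait for the echo.
    msg = b"x" * size
    start = time.perf_counter()
    for _ in range(iters):
        conn.send_bytes(msg)
        conn.recv_bytes()
    # Half the average round-trip time approximates one-way latency.
    results[idx] = (time.perf_counter() - start) / (2 * iters)
    conn.close()


def multi_pair_latency(pairs=4, iters=1000):
    # Start all echo peers and all timing threads before measuring, so that
    # every pair is exchanging messages concurrently.
    results = [None] * pairs
    procs, threads = [], []
    for i in range(pairs):
        a, b = Pipe()
        p = Process(target=echo, args=(b, iters))
        p.start()
        procs.append(p)
        threads.append(threading.Thread(target=ping, args=(a, iters, results, i)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    for i, lat in enumerate(multi_pair_latency()):
        print(f"pair {i}: {lat * 1e6:.2f} us")
```

A real InfiniBand version of this harness would replace the pipe send/receive calls with RDMA-Write or RDMA-Read work requests posted through the verbs interface, with one process pinned per core.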