Tools for Benchmarking, Tracing, and Simulating SHMEM Applications

I. INTRODUCTION

Two-sided communication has been the dominant protocol for developing high-performance applications. However, the cost of synchronization in point-to-point communication poses a challenge on systems with more than 100,000 cores. This problem will be further exacerbated when scaling to exascale systems, where core counts may exceed 2