A Secured Large Heterogeneous HPC Cluster System Using a Massively Parallel Programming Model with Accelerated GPUs

High Performance Computing (HPC) architectures are expected to deliver the first ExaFlops computer. Such an Exascale system will be able to perform 10^18 floating-point operations per second, a thousand-fold increase over current Petascale systems. Current technologies face several challenges on the way to such extreme-scale computing. It is anticipated that billion-way parallelism will have to be exploited to build a secure Exascale-level system that delivers massive performance within predefined constraints such as the number of processing cores and the power budget. Developing a secure, energy-efficient ExaFlops-level system therefore requires well-designed programming strategies. This study proposes a non-blocking, overlapping, GPU-computation-based tri-hybrid model (MPI, OpenMP, and CUDA) that provides massive parallelism at different levels of granularity. We implemented three different message-passing strategies and performed the experiments on the Aziz-Fujitsu PRIMERGY CX400 supercomputer. A comprehensive experimental study was conducted to validate the performance and energy efficiency of our model. The experimental results show that the proposed EPC model can be considered an initial, leading scheme for achieving massive performance in Exascale computing systems.

Keywords: High Performance Computing (HPC); MPI; OpenMP; CUDA; Supercomputing Systems
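
The abstract describes the tri-hybrid model only at a high level, so the following is a minimal sketch of how its three granularity levels are typically combined in C/CUDA: a non-blocking MPI ring exchange at the inter-node level, an asynchronous CUDA stream at the device level, and an OpenMP parallel loop at the core level, all overlapped and joined at a single synchronisation point. The buffer layout, the scale kernel, and the ring-exchange pattern are illustrative assumptions, not code from the paper.

// tri_hybrid_sketch.cu -- illustrative sketch, not the paper's EPC implementation.
// Build (one possibility): nvcc -Xcompiler -fopenmp tri_hybrid_sketch.cu -lmpi
#include <mpi.h>
#include <omp.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 20)

/* Placeholder GPU workload: scale a vector in place. */
__global__ void scale(float *v, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= a;
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Separate buffers so MPI traffic and GPU traffic never alias. */
    float *sendbuf = (float *)malloc(N * sizeof(float));
    float *recvbuf = (float *)malloc(N * sizeof(float));
    float *work;                 /* pinned, so the async copies truly overlap */
    cudaMallocHost(&work, N * sizeof(float));
    for (int i = 0; i < N; i++) { sendbuf[i] = (float)rank; work[i] = 1.0f; }

    float *dev;
    cudaMalloc(&dev, N * sizeof(float));
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    /* Coarse grain: start a non-blocking ring exchange between ranks. */
    MPI_Request reqs[2];
    MPI_Isend(sendbuf, N, MPI_FLOAT, (rank + 1) % size, 0,
              MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(recvbuf, N, MPI_FLOAT, (rank + size - 1) % size, 0,
              MPI_COMM_WORLD, &reqs[1]);

    /* Device grain: queue copy-in, kernel, copy-out on an async stream. */
    cudaMemcpyAsync(dev, work, N * sizeof(float), cudaMemcpyHostToDevice, stream);
    scale<<<(N + 255) / 256, 256, 0, stream>>>(dev, 2.0f, N);
    cudaMemcpyAsync(work, dev, N * sizeof(float), cudaMemcpyDeviceToHost, stream);

    /* Core grain: OpenMP threads compute while the network and GPU are busy
       (deliberately not touching the in-flight MPI buffers). */
    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < N; i++) local += (double)i * 1e-9;

    /* Join all three layers before the results are combined. */
    cudaStreamSynchronize(stream);
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    printf("rank %d: cpu sum = %.3f, gpu[0] = %.1f, halo[0] = %.1f\n",
           rank, local, work[0], recvbuf[0]);

    cudaStreamDestroy(stream);
    cudaFree(dev);
    cudaFreeHost(work);
    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}

One design point worth noting: the GPU staging buffer is allocated with cudaMallocHost because pinned host memory is what lets cudaMemcpyAsync genuinely overlap with the MPI transfers and the OpenMP loop; with pageable memory the copies would serialise and the overlap the abstract emphasises would be lost.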
