Optimization of High Performance Computing Cluster based on Intel MIC

This paper presents a theoretical analysis, computational testing, and optimization of a High Performance Computing Cluster (HPCC), also known as a Data Analytics Supercomputer (DAS), built on the Intel Many Integrated Core (MIC) architecture and evaluated with the High Performance Linpack (HPL) benchmark. The initial platform consists of 5 nodes, each with one Intel® Xeon Phi™ Coprocessor 31S1P, and is used to analyze the power consumption and the degree of parallelism of the Intel MIC accelerator. The benchmark is compiled against the Message Passing Interface (MPI) library and the Math Kernel Library (MKL); the Make file is modified, and the configuration is debugged by adjusting parameters in hpccinf.txt. Based on an evaluation across multiple InfiniBand-connected nodes, Intel's libhpl library is used, and during debugging the block size NB is set to 960 for a single node with one MIC. The paper also optimizes the Double Precision General Matrix Multiplication (DGEMM) and PTRANS tests to obtain efficient, accurate results with shorter run times. The HPL and Gridding Program experimental results indicate that the theoretical analysis and the experiments were carried out successfully.
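As an illustration of how the block size NB interacts with the HPL problem size, the following minimal Python sketch picks the largest matrix order N that is a multiple of NB and still fits in a chosen fraction of memory. It uses a common rule of thumb, not the paper's procedure: the function name `hpl_problem_size`, the 80% memory fraction, and the 8 GiB figure in the usage line are assumptions made for illustration; only NB = 960 comes from the abstract.

```python
import math


def hpl_problem_size(mem_bytes: int, nb: int = 960, fraction: float = 0.8) -> int:
    """Return the largest N that is a multiple of nb such that an
    N x N matrix of 8-byte doubles fits in `fraction` of mem_bytes.

    Hypothetical helper: the 0.8 default is a rule of thumb, not a
    value reported in the paper.
    """
    if mem_bytes <= 0 or nb <= 0 or not (0.0 < fraction <= 1.0):
        raise ValueError("mem_bytes and nb must be positive, fraction in (0, 1]")
    usable_doubles = int(mem_bytes * fraction) // 8  # 8 bytes per double
    n_max = math.isqrt(usable_doubles)               # largest N with N*N <= usable_doubles
    return (n_max // nb) * nb                        # round down to a multiple of nb


if __name__ == "__main__":
    # Assumed memory size (8 GiB) purely for demonstration.
    n = hpl_problem_size(8 * 2**30, nb=960)
    print(f"N = {n}, blocks per dimension = {n // 960}")
```

Rounding N down to a multiple of NB avoids a partial trailing block, which is one reason N and NB are usually tuned together in the HPL section of the input file.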
