Using AWS EC2 as Test-Bed infrastructure in the I/O system configuration for HPC applications

In recent years, public cloud platforms have gained popularity as infrastructure in many scientific areas, and High Performance Computing (HPC) is no exception. System administrators can use such platforms as test-bed systems to evaluate and detect performance inefficiencies in the I/O subsystem, and to make decisions about the configuration parameters that influence application performance, without compromising the performance of the production HPC system. In this paper, we propose a methodology for evaluating parallel applications using virtual clusters as a test system. Our experimental validation indicates that virtual clusters give system administrators a quick and easy way to analyze the impact of the I/O system on the I/O kernels of parallel applications and to make performance decisions in a controlled environment.
