Building a Private HPC Cloud for Compute and Data-Intensive Applications

Traditional HPC (High Performance Computing) clusters are best suited for well-formed calculations. The orderly, batch-oriented HPC cluster offers maximal performance potential per application, but limits resource efficiency and user flexibility. An HPC cloud can host multiple virtual HPC clusters, giving scientists unprecedented flexibility for research and development. With the proper incentive model, resource efficiency will be automatically maximized. In this context, there are three new challenges. The first is virtualization overhead. The second is the administrative complexity that scientists face in managing virtual clusters. The third is the programming model: existing HPC programming models were designed for dedicated, homogeneous parallel processors, whereas the HPC cloud is typically heterogeneous and shared. This paper reports on the practice and experience of building a private HPC cloud using a subset of a traditional HPC cluster. We report our evaluation criteria, using Open Source software, and performance studies for compute-intensive and data-intensive applications. We also report the design and implementation of a Puppet-based virtual cluster administration tool called HPCFY. In addition, we show that even in the presence of virtualization overhead, efficient scalability for virtual clusters can be achieved by understanding the effects of that overhead on various types of HPC and Big Data workloads. We aim to provide a detailed experience report to the HPC community, to ease the process of building a private HPC cloud using Open Source software.