Conceptualizing a Computing Platform for Science Beyond 2020: To Cloudify HPC, or HPCify Clouds?

A primary challenge for the cyberinfrastructure research community is to define the Platforms for Science beyond 2020. We analyze major current trends and propose that, in order to deliver the Platform for Science in 2020, the dominant research challenge is to manage the convergence of the capabilities of traditional High Performance Computing (HPC) systems with the richness of Apache Big Data systems. In this vision paper, we examine the relationship between infrastructure for data-intensive computing and infrastructure for HPC, and consider a possible "convergence" of their capabilities.