Lessons Learned from a Decade of Providing Interactive, On-Demand High Performance Computing to Scientists and Engineers

For decades, the use of HPC systems was limited to those in the physical sciences who had mastered their domain along with a deep understanding of HPC architectures and algorithms. During these same decades, advances in consumer computing devices produced tablets and smartphones that allow millions of children to interactively develop and share code projects across the globe. As the HPC community faces the challenge of guiding researchers from disciplines that use high productivity interactive tools toward effective use of HPC systems, it seems appropriate to revisit the assumptions about the skills required for access to large computational systems. For over a decade, MIT Lincoln Laboratory has supported interactive, on-demand high performance computing by seamlessly integrating familiar high productivity tools, giving users more design turns, a rapid prototyping capability, and faster time to insight. In this paper, we discuss the lessons learned while supporting interactive, on-demand high performance computing, from the perspectives of both the users and the team supporting the users and the system. Building on these lessons, we present an overview of current needs and the technical solutions we are building to lower the barrier to entry for new users from the humanities and the social and biological sciences.
