PCS: A Productive Computational Science Platform

As modern supercomputers grow increasingly heterogeneous, incorporating diverse computational accelerators (graphics processing units (GPUs), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), etc.), software becomes a critical design aspect. Exploiting this new computational power requires more software design time and effort, because scientific discovery must be achieved despite the complicated programming environments these accelerators introduce. To address these challenges, we propose unifying multiple programming models into a single programming environment to facilitate large-scale, accelerator-aware, heterogeneous computing for next-generation scientific applications. This paper presents PCS, a productive computational science platform for cluster-scale heterogeneous computing. Focusing on FPGAs, we describe the key concepts of the PCS platform, differentiate PCS from the current state of the art, propose a new multi-FPGA architecture for graph-centric workloads (e.g., deep learning), and discuss ongoing work.
