The SAGE project: a storage centric approach for exascale computing: invited paper

SAGE (Percipient StorAGe for Exascale Data Centric Computing) is a European Commission funded project for the era of Exascale computing. Its goal is to design and implement a Big Data/Extreme Computing (BDEC) capable infrastructure together with its associated software stack. The SAGE system follows a storage-centric approach: it is capable of storing and processing large data volumes at the Exascale regime. SAGE addresses the convergence of Big Data analysis and HPC in an era of next-generation data-centric computing. This convergence is driven by the proliferation of massive data sources, such as large, dispersed scientific instruments and sensors, whose data needs to be processed, analyzed, and integrated into simulations to derive scientific and innovative insights. A first prototype of the SAGE system has been implemented and installed at the Jülich Supercomputing Center. The SAGE storage system consists of multiple types of storage device technologies in a multi-tier I/O hierarchy, including flash, disk, and non-volatile memory. The main SAGE software component is the Seagate Mero object storage, which is accessible via the Clovis API and higher-level interfaces. The SAGE project also includes scientific applications for the validation of the SAGE concepts. The objective of this paper is to present the SAGE project concepts and the prototype of the SAGE platform, and to discuss the software architecture of the SAGE system.
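
The abstract does not show the Clovis interface itself, so the following C sketch is an illustration only: it mimics the general shape of a storage-centric object interface in which an application writes an object and attaches a placement hint for the multi-tier hierarchy (non-volatile memory, flash, disk). All names here (sage_obj, sage_obj_put, SAGE_TIER_*) are invented for the sketch and are not the actual Clovis API.

```c
/*
 * Hypothetical sketch of a tier-aware object write.
 * Not the Clovis API: all identifiers below are invented for illustration.
 */
#include <stdio.h>

/* Storage tiers of the assumed multi-tier I/O hierarchy. */
typedef enum { SAGE_TIER_NVRAM, SAGE_TIER_FLASH, SAGE_TIER_DISK } sage_tier;

typedef struct {
    unsigned long long id;  /* object stores typically use wider IDs; shortened here */
    sage_tier hint;         /* preferred tier for initial placement */
    size_t    size;         /* bytes written so far */
} sage_obj;

/*
 * Toy stand-in for an object write. A real object-store client would build an
 * operation, launch it asynchronously, and wait for completion; this sketch
 * only records the placement hint and size, then reports what it would do.
 */
static int sage_obj_put(sage_obj *obj, const void *buf, size_t len, sage_tier hint)
{
    (void)buf;  /* payload ignored in this toy version */
    obj->hint = hint;
    obj->size = len;
    printf("object %llu: %zu bytes, placement hint tier %d\n",
           obj->id, len, (int)hint);
    return 0;
}

int main(void)
{
    const char payload[] = "checkpoint block 0";
    sage_obj obj = { .id = 42 };

    /* Hot data (e.g. an in-progress checkpoint) is hinted to the NVRAM tier. */
    return sage_obj_put(&obj, payload, sizeof payload, SAGE_TIER_NVRAM);
}
```

The design point the sketch is meant to convey is that tier placement is expressed by the application as a hint on the object operation, rather than by writing to a tier-specific file system path, which is consistent with the paper's storage-centric, multi-tier approach.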
