Reference Exascale Architecture

While political commitments to build exascale systems have been made, turning these systems into platforms for a wide range of exascale applications faces several technical, organisational and skills-related challenges. The key technical challenges relate to the availability of data. While the first exascale machines are likely to be built within a single site, the input data is in many cases impossible to store within a single site. Alongside handling extremely large amounts of data, an exascale system has to process data from different sources, support accelerated computing, handle a high volume of requests per day, minimize the size of data flows, and remain extensible as both the data volume and the number of parallel requests continue to grow. These technical challenges are addressed by the general reference exascale architecture. It is divided into three main blocks: a virtualization layer, a distributed virtual file system, and a manager of computing resources. Its main property is modularity, which is achieved by containerization at two levels: 1) application containers: containerization of scientific workflows; 2) micro-infrastructure: containerization of the service-oriented infrastructure for extremely large data. The paper also presents an instantiation of the reference architecture, the architecture of the PROCESS project (PROviding Computing solutions for ExaScale ChallengeS), and discusses its relation to the reference exascale architecture. The PROCESS architecture has been used as an exascale platform within various exascale pilot applications. This work presents the requirements and the derived architecture as well as the five use-case pilots that it made possible.
