The Italian research on HPC key technologies across EuroHPC

High-Performance Computing (HPC) is one of the strategic priorities for research and innovation worldwide due to its relevance for industrial and scientific applications. We envision HPC as composed of three pillars: infrastructures, applications, and key technologies and tools. While infrastructures are by construction centralized in large-scale HPC centers, and applications generally fall within the purview of domain-specific organizations, key technologies are an intermediate case: coordination is needed, but design and development are often decentralized. A large group of Italian researchers has started a dedicated laboratory within the National Interuniversity Consortium for Informatics (CINI) to address this challenge. The laboratory, albeit young, has succeeded in its first attempts to propose a coordinated approach to HPC research within the EuroHPC Joint Undertaking, participating in five successful proposals in the 2019-20 calls, for an aggregate total cost of 95M€. In this paper, we outline the working group's scope and goals and provide an overview of the five funded projects, which become fully operational in March 2021. We also cover a selection of key technologies provided by the working group partners, highlighting their usage and development within the projects.
