The Virtual Research Environment: towards a comprehensive analysis platform

The Virtual Research Environment is an analysis platform developed at CERN to serve scientific communities involved in European projects. It aims to facilitate the development of end-to-end physics workflows by giving researchers access to the infrastructure and digital content needed to produce and preserve a scientific result in compliance with the FAIR principles. The platform is intended to demonstrate how sciences ranging from High Energy Physics to Astrophysics can benefit from common technologies that were originally developed to meet CERN's exabyte-scale data management needs. Its main components are (1) a federated distributed storage solution (the Data Lake), which provides data injection and replication through a data management framework (Rucio); (2) a computing cluster that supplies the processing power to run full analyses with REANA, a reusable analysis platform; (3) a federated and reliable Authentication and Authorization layer; and (4) an enhanced notebook interface with containerised environments that hides the infrastructure's complexity from the user. The deployment of the Virtual Research Environment is open source and modular so that partner institutions can reproduce it easily; it is publicly accessible and kept up to date using state-of-the-art IT infrastructure technologies.
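The replication role of the Data Lake (component 1) can be illustrated with a small toy model. This is a minimal sketch and not Rucio's API: the class `StorageEndpoint`, the function `apply_replication_rule`, the site names, and the file identifier are all hypothetical, chosen only to show the idea of declaring how many copies of a file should exist and letting the system place them across federated storage sites.

```python
from dataclasses import dataclass, field


@dataclass
class StorageEndpoint:
    """A hypothetical storage site in the federated Data Lake."""
    name: str
    files: set = field(default_factory=set)


def apply_replication_rule(file_id, endpoints, copies):
    """Place `file_id` on endpoints until `copies` replicas exist.

    Returns the names of all endpoints holding the file afterwards.
    """
    if copies > len(endpoints):
        raise ValueError("not enough endpoints to satisfy the rule")
    # Endpoints that already hold the file count towards the rule.
    holders = [e for e in endpoints if file_id in e.files]
    # Fill the gap with endpoints that lack the file, in list order.
    for endpoint in endpoints:
        if len(holders) >= copies:
            break
        if file_id not in endpoint.files:
            endpoint.files.add(file_id)
            holders.append(endpoint)
    return [e.name for e in holders]


# Hypothetical three-site lake; the file is first injected at one site.
lake = [StorageEndpoint("SITE_A"), StorageEndpoint("SITE_B"), StorageEndpoint("SITE_C")]
lake[1].files.add("dataset/file1")
replicas = apply_replication_rule("dataset/file1", lake, copies=2)
print(replicas)
```

In this toy model the user states only the desired number of copies. Where the copies end up is decided by the placement logic, which is the kind of complexity the platform is meant to hide from researchers.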
