The MAX IV Laboratory Scientific Data Management

Scientific Data Management is a key aspect of the IT system of a user research facility like the MAX IV Laboratory. By definition, this system handles the data produced by the experimental users of such a facility. It could be perceived as being as easy as using an external hard drive to store the experimental data and carry it back to the home institute for analysis. On the other hand, the "data" can be seen as more than just a file in a directory, and the "management" as more than a copy operation. Balancing simplicity and a good user experience against security, authentication and reliability is among the main challenges of this project, along with the changes in mindset it requires. This article explains the concepts behind the system, its basic rollout at the MAX IV Laboratory for the first users, and the features anticipated in the future (Fig. 1).

DO RESEARCHERS NEED SUPPORT TO MANAGE THEIR DATA?

In short, yes. There is a growing consensus spanning the European Commission, research infrastructures, higher education and research groups that services for researchers need to be improved. Although researchers own the data, they are not able to dedicate enough focus to data management while the amount of data generated is rapidly growing. Within facilities such as the MAX IV Laboratory, the handling of users is split among many groups, each of which handles a different part of the workflow. As a result, visiting researchers need to access many internal tools and systems as they follow the facility's experimental visit workflow: proposing an experiment, planning a visit, following safety procedures, using the beamline computer systems, data storage and computation, and finally obtaining data and results. Without any internal integration of these systems, the user is normally obliged to carry information (login details, experimental context and history of events) between different systems, in their head, on paper or in their own electronic log formats.
They must try to keep track of the relevant information needed at each step and be able to take what they need away from the facility. The Scientific Data Management (SDM) system is designed to create links across all systems, allowing each component to exchange information without the researcher having to do it. The result is that each system can implement a smoother, less detailed workflow, increasing effectiveness while making it easier to add more complexity, such as metadata acquisition during the experiment, automated data processing workflows and even automated migration to designated remote data centres. It is important that the SDM workflows appear simple and reliable in order to build trust in using them, in place of the current trust in portable physical disks as a way of transferring data, which makes the data virtually untrackable once it leaves the facility. In addition, the new high-data-volume experiments produce too much data for this method to be feasible.

Figure 1: The MAX IV imaging concept. [1]

SCIENTIFIC DATA MANAGEMENT