ESCAPE aims to address the challenges of open science. In the first year of the project, a prototype of an open-access data lake for the ESCAPE community was set up: the ESCAPE Data Infrastructure for Open Science (DIOS). This tool will bring together data from different communities on a robust infrastructure that appears uniform to users while pooling storage resources distributed across Europe. Access to the data must likewise be transparent and reliable for users.
A first simple exercise, re-analysing the data that led to the discovery of the Higgs boson, was carried out using public data from CERN's ATLAS experiment. It ran successfully through the full chain defined by the different components of the ESCAPE infrastructure. The preparatory step consisted of downloading the data from a public web server and then using the RUCIO data management service, which provides tools to transfer the files into the data lake and expose their paths to users. The second phase involved using authentication and authorization services to access the data, and finally processing the data to select events and reconstruct an excess of events in the two-photon invariant mass distribution.
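The final processing step, selecting events and computing the two-photon invariant mass, can be sketched as follows. This is a minimal illustration and not the analysis code used in the exercise: the function names, the event layout (pairs of photon transverse momentum, pseudorapidity and azimuthal angle) and the cut values are all assumptions made for the example. It relies only on the standard formula for the invariant mass of two massless particles, m^2 = 2 pT1 pT2 (cosh(eta1 - eta2) - cos(phi1 - phi2)).

```python
import math


def diphoton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless particles, in the same units as pt."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


def select_masses(events, pt_min=25.0, window=(100.0, 160.0)):
    """Return the diphoton masses of events passing simple cuts.

    `events` is a hypothetical layout: an iterable of photon pairs, each
    photon given as (pt, eta, phi). The pt threshold and the mass window
    are illustrative placeholders, not the cuts of the actual analysis.
    """
    masses = []
    for (pt1, eta1, phi1), (pt2, eta2, phi2) in events:
        # Skip events where either photon is below the pt threshold.
        if pt1 < pt_min or pt2 < pt_min:
            continue
        mass = diphoton_mass(pt1, eta1, phi1, pt2, eta2, phi2)
        # Keep only masses inside the window of interest.
        if window[0] <= mass <= window[1]:
            masses.append(mass)
    return masses
```

For instance, two back-to-back photons at zero pseudorapidity with a transverse momentum of 62.5 GeV each yield an invariant mass of 125 GeV. Filling a histogram with the selected masses over many events is what would reveal an excess in the distribution.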
This first experiment showed that the small-scale ESCAPE data lake, on the order of gigabytes, works well. The next steps are to confirm this positive result with data from other experiments, in particular CTA and LSST, and to repeat the operation at the terabyte scale.