February 28, 2021

Since the start of the ESCAPE project, WP2 (Data Infrastructure for Open Science - DIOS) has come a long way in its goal to open up federated, distributed data management technologies to the broad science communities of the project.

We have built a pilot Data Lake and deployed all the supporting services required to make it work, not just with the three storage locations targeted in our plan, but with ten sites active across ESCAPE partner countries. At least as important as the physical and digital infrastructure is the community we have built: we have onboarded users from across ESCAPE's Astrophysics, Astro-particle and Particle Physics science experiments and participating institutes, engaging them not only in assessing the performance of the Data Lake but also in its day-to-day operation. Some of these users are completely new to the elements of the ESCAPE DIOS technology stack, whilst others have significant experience in the specific context (especially in High Energy Physics). Some experiments already have operational systems and are seeking ways to improve the automation of their data management, whilst others are looking ahead at large-scale data challenges and asking if and how the open-source technologies presented in the ESCAPE project can answer their needs. Our ESCAPE Data Lake is relatively modest in scale compared with the enormous data volumes managed in production in High Energy Physics data catalogues (e.g. ATLAS and CMS). Precisely because we are not running a production instance, we have been able to try challenging tests (which do sometimes break things) and to reconfigure or rebuild our services as we learn.

In preparation for our assessment period (Nov-Dec 2020), we built up our continuous testing and monitoring work - we have working suites of tests running continuously that give us feedback on the health of most areas of the technology stack. We developed an operations team from across sites and experiments, which shares the daily health check rota and meets weekly to review the status of issues. The ESCAPE Data Lake is available continuously to all users, but we decided that a focused testing time would help to ensure good user support and put more strain on the central Data Lake services, thereby more closely mimicking real-life scenarios. Our first testing window took place on 17th November, and we held a work-package-wide workshop on 9-10 December to present the results and identify lessons learned. Some experiments then continued their testing during the second testing window on 15th December. Over the course of both testing windows, teams representing ESCAPE experiments developed over a dozen different tests. Through the broad user community in WP2, we have been able to undertake an assessment of the Data Lake that has provided genuinely useful feedback - for our own benefit but also for the developers of the underlying technologies.

In this document we describe our pilot Data Lake and present both the continuous monitoring and the experiment-led tests in detail. The context for each experiment is also of interest, because it shows which Data Lake features matter to that experiment; we have therefore included several context examples in an appendix to this document (page 46 onwards).

The process of hardening our pilot Data Lake into something more mature has certainly been accelerated by this testing work, and it will continue as we move to the "Prototype" phase of ESCAPE - the results of the assessment presented in this document are our starting point for that future work. Overall, collaborators in this work package have been positively impressed by the performance of the Data Lake, and by the resilience shown by Rucio in particular. The interaction between ESCAPE WP2 colleagues has been excellent too - forming a coherent team, with members comfortable asking questions about technologies unfamiliar to them, has been a great success, especially as travel restrictions have made it impossible to meet in person. For the first time, we have seen our flagship ESFRIs in Astro-particle Physics, Electromagnetic and Gravitational Wave Astronomy, Particle Physics, and Nuclear Physics working together in a common scientific data management infrastructure – the ESCAPE Data Lake – that seeks to implement the standards of data FAIRness and Open Access. This is an achievement worth highlighting; we are breaking down historic sociological barriers between these different scientific communities. We are both searching for commonalities and seeking to understand differences in their computing models, with a view to successfully addressing the plans for these challenging experiments in the next decade, especially in data management and computing. The results of the assessment work that we present here give us confidence that the Data Lake concept, and the tools we are testing, should be able to scale to meet the foreseen future needs of the Research Infrastructures represented in ESCAPE, in terms of number of sites, storage volumes, data rates and number of users.

 

Deliverable submitted but pending European Commission approval.