ESCAPE DIOS Workshop: A Full Dress Rehearsal Exercise of the ESCAPE pilot Data Lake
24 February 2021

On 9 and 10 December 2020, the ESCAPE DIOS team, which develops the ESCAPE DIOS (Data Infrastructure for Open Science) service, organised the second edition of the ESCAPE DIOS Workshop. This two-day workshop brought together 9 ESFRIs (European Strategy Forum on Research Infrastructures) to share with the ESCAPE DIOS team the results of the tests they performed in the ESCAPE pilot Data Lake during the full dress rehearsal that took place on 17 November 2020.

A common Data Lake infrastructure to serve global user communities

The ESCAPE DIOS team aims to deliver a prototype of an infrastructure adapted to the Exabyte-scale needs of large science projects. It will be a common data infrastructure for Astro-particle Physics, Radio Astronomy, Gravitational Waves, Cosmology and Particle Physics. ESCAPE DIOS is developed under the FAIR (Findable, Accessible, Interoperable, Reusable) data management principles to support open science research. This environment allows use cases to be tested, with a view to understanding how the differing needs for distributed data storage, access and manipulation across these sciences might be supported.

The novelty of the ESCAPE DIOS framework is that it brings together, for the first time, 9 flagship ESFRIs (ATLAS, CMS, CTA, EGO-VIRGO, FAIR, LOFAR, LSST, MAGIC and SKA) in Astro-particle Physics, Electromagnetic and Gravitational Wave Astronomy, Particle Physics, and Nuclear Physics, to work on a common scientific data management infrastructure and on other day-to-day operational aspects. This breaks the historical sociological barrier between scientific communities as they search for commonalities in their computing models, with a view to successfully addressing the plans for challenging experiments in the next decade, especially in data management and computing. It represents a remarkable milestone in cooperation towards the implementation of data FAIRness standards and open access.

The test results from the ESFRIs were presented at the ESCAPE DIOS workshop. Detailed results of each individual assessment are available in “D2.2 Assessment and analysis of performance of the first pilot data lake” and in the dedicated presentations given during the workshop.

The promising results from the ESCAPE DIOS Full Dress Rehearsal

The importance of this ESCAPE DIOS workshop lies in the fact that the ESFRIs' input will be essential for the ESCAPE DIOS full prototype, whose development started in early 2021. The tests, assessments and analyses performed in the full dress rehearsal were very successful and bode well for the further development of the prototype. There was a strong emphasis on the features of ESCAPE DIOS in the context of each experiment's use case, on the functionality of each ESCAPE DIOS feature, and on future directions, improvements and innovations. The results collected will allow the ESCAPE DIOS team to meet the foreseen future needs of the ESFRIs represented in ESCAPE.

 

Left: SKA at Night. Credits & Copyright: SKA Organisation. Right: A depiction of what the completed LSST observatory will look like atop El Peñon summit, Chile. Credits: LSST Project/NSF/AURA.

What were the encouraging results? From the astrophysics field, CTA simulated a night of data capture from a telescope in the Canary Islands over 6 hours. It ingested 500 datasets of 10 files each, successfully transferring a total of 7 terabytes of data. SKA uploaded and moved data for 24 hours to simulate the data life cycle based on Quality of Service (QoS) classes, with 1 GB of data involved. From particle physics, FAIR ingested and replicated one 1 GB file every 10 minutes during the whole rehearsal, successfully processing a total of 3 terabytes of data. The ATLAS experiment, from High Energy Physics, tested ESCAPE DIOS reliability along with QoS functionality through continuous uploads and QoS-based replications, reaching a successful final result of 3 TB. EGO-VIRGO, dealing with gravitational waves, uploaded VIRGO public data sampled at 4 kHz from an EGO server to ESCAPE DIOS. Around 2 TB of real-time strain data was made available to pipelines and tools assessing the data quality.

Towards the full prototype of a data infrastructure for Open Science

The third edition of the ESCAPE DIOS workshop will take place by early 2022, for a final review of the ESCAPE DIOS prototype's performance (to be detailed in the future D2.3 report). ESCAPE DIOS is planned to launch by June of the same year, after which its deployment towards full production services within the EOSC will be considered.

READ THE “D2.2 Assessment and analysis of performance of the first pilot data lake” FOR MORE DETAILED INFORMATION ON THE RESULTS OF THE ESCAPE DIOS PILOT ASSESSMENT