Dwingeloo, the Netherlands
JIVE is a European institute that provides support, conducts leading research and advances technical development in the field of radio astronomy. It is the central organisation in the European VLBI Network (EVN), a network of radio telescopes across the globe that can be coordinated to observe cosmic radio sources. Any scientist may submit a proposal to observe with the EVN, and proposals are evaluated on scientific merit. Data collected during these observations are brought together at a supercomputer, known as the correlator, located at JIVE.
A single radio telescope is often not powerful enough on its own to observe cosmic radio sources in fine detail. In such cases, multiple telescopes across the globe are coordinated to increase the resolving power, effectively turning the network into a single instrument. This technique is known as Very Long Baseline Interferometry (VLBI).
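To give a feel for the gain in resolving power, the diffraction limit (angular resolution roughly equal to observing wavelength divided by aperture or baseline length) can be evaluated for a single dish and for a long baseline. The sketch below is illustrative only: the 5 GHz frequency, the 100 m dish and the 8000 km baseline are assumed example values, not figures taken from JIVE or the EVN.

```python
import math

C = 299_792_458.0  # speed of light in m/s
RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0  # radians -> milliarcseconds


def resolution_mas(frequency_hz: float, baseline_m: float) -> float:
    """Diffraction-limited angular resolution, theta ~ lambda / B, in mas."""
    wavelength_m = C / frequency_hz
    return (wavelength_m / baseline_m) * RAD_TO_MAS


# Assumed example values (not from the text above).
single_dish = resolution_mas(5e9, 100.0)          # one 100 m dish at 5 GHz
vlbi_network = resolution_mas(5e9, 8_000_000.0)   # 8000 km baseline at 5 GHz

print(f"single dish : {single_dish:,.0f} mas")
print(f"VLBI network: {vlbi_network:.2f} mas")
```

With these assumed numbers the single dish resolves roughly two arcminutes, while the 8000 km baseline reaches about 1.5 milliarcseconds. The improvement is simply the ratio of the baseline to the dish diameter.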
Coordinating a global network presents technical challenges that are addressed at JIVE. Overcoming such challenges is key to obtaining results that have the potential to radically change the way we perceive the universe and to improve our understanding of its fundamental laws.
See the JIVE JupyterHub at https://jupyterhub.jive.eu
The ESCAPE project provides an opportunity for JIVE to further develop and modify existing tools that allow researchers to work with large data sets. This is vital as the amount of data collected about the universe continues to increase, particularly as new radio telescopes are built across the globe. The project ensures sustainable and efficient handling of data for users across the astronomy community and beyond.
JIVE's main mission is to support the European VLBI Network and its users in the broadest sense. These developments allow JIVE to go beyond providing expert knowledge and software, and to also provide resources on which users can do their analysis.
Together, three components form a complete system, not unlike what is envisioned for EOSC: a collection of DOI-citable software packages (ESCAPE OSSR: versioned binaries uploaded to Zenodo); enhanced findability of data, because more archives have been opened up via the ESCAPE VO; and a platform that links software and data and deploys the analysis software to distributed compute and storage resources (ESCAPE ESAP). This system allows (new) users to easily find and process data from a wide variety of instruments or research infrastructures.
Thanks to ESCAPE, JIVE was able to build an infrastructure that allows others to do their science more easily. By providing compute, network and storage hardware (in the JIVE JupyterHub) and a more accessible search interface, JIVE has significantly lowered the barrier to entry into VLBI data reduction. Data set sizes have increased, and not all users have access to sufficient compute, network and storage resources to handle today's data sets. Being able to provide these resources is a big step towards more inclusive science, since it allows those with less well-endowed systems to do their science.
All these developments fit together better than originally planned and have resulted in a complete solution that lowers the barrier to finding and processing VLBI data for non-VLBI experts.
The goals of the ESCAPE OSSR also include providing a DOI-citable list of stable software products with links to documentation and maintainers. This allows existing software packages to be enhanced, which is well aligned with the problems JIVE tried to address.
The ESCAPE OSSR helped JIVE to address the reproducibility of science results and enabled a whole new interface for doing data reduction. Thanks to the ESCAPE OSSR, there is now citable software in an actively maintained repository. Entry into the repository means that certain guarantees (licensing, documentation, version control) have been checked against the OSSR specifications.
The (generic) radio-astronomical data reduction package CASA was extended so that it can be run as a Jupyter kernel. Several of the (generic) data reduction tasks required fixes for VLBI data processing, and in some cases new tasks had to be created. A DOI-citable, Jupyter-kernel-enabled package, stored on Zenodo, provides a known-good and stable execution environment for users wishing to do radio-astronomical data processing, including VLBI.
The ESCAPE VO is an excellent example of interoperability. Adding a radio-astronomical archive to the collection of VO-findable data sets makes the EVN data sets immediately findable by scientists worldwide. Frankly, the VO is an amazing example of how interoperable tools should work.
JIVE has hosted the European VLBI Network archive for over two decades now, but the findability of EVN data could be improved a lot.
Currently, a self-developed search utility with limited functionality is available, but it is slow, only operable as a web form (not machine actionable), and does not necessarily return all results. Downloading the (very) large VLBI data sets is also a manual process. Making the entries in the EVN Archive findable using the VO Astronomical Data Query Language (ADQL) and downloadable using the DataLink protocol had long been considered and, thanks to ESCAPE funding, is now possible.
VLBI data do not fit the VO model perfectly. Together with a group of international experts, changes to the VO standard were discussed to allow it to describe VLBI data. Following that, code was written to read the data files in the EVN Archive and to harvest and compute the metadata that the VO protocols require. Finally, using ESCAPE funding, hardware was purchased, installed and configured to run a VO ADQL query server that handles search requests and exposes the VO ObsCore and DataLink services.
It is now a lot easier to analyse JIVE's own archive! The ADQL language allows very flexible filtering and reporting that was not possible before. The EVN Archive website will soon start to use this search utility too, and the JupyterLab plugin (developed in ESAP) uses both the ADQL query functionality and the DataLink protocol to find and download data sets from the EVN Archive. Access to the EVN Archive can now be scripted - i.e. be executed by machines too.
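As an illustration of what scripted access can look like, the sketch below builds an ADQL cone search against the standard `ivoa.obscore` table and, optionally, submits it to a TAP service through the third-party `pyvo` package. It is a minimal sketch under stated assumptions: the function names are hypothetical, the sky position is an arbitrary example, and the TAP endpoint of the EVN Archive is not given in this text, so it is left as a parameter.

```python
def build_obscore_cone_query(ra_deg: float, dec_deg: float,
                             radius_deg: float, limit: int = 10) -> str:
    """Build an ADQL cone search against the standard ObsCore table."""
    return (
        f"SELECT TOP {int(limit)} obs_id, target_name, t_min, t_max, access_url "
        "FROM ivoa.obscore "
        "WHERE 1 = CONTAINS(POINT('ICRS', s_ra, s_dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg})) "
        "ORDER BY t_min"
    )


def search_archive(tap_url: str, adql: str):
    """Run the query on a TAP service; needs the third-party `pyvo` package."""
    import pyvo  # imported lazily so that building queries works without it
    return pyvo.dal.TAPService(tap_url).search(adql)


# Example position (approximately that of 3C 273), used only to show the query.
query = build_obscore_cone_query(187.2779, 2.0524, 0.1)
print(query)
```

Passing the resulting string and the archive's TAP endpoint to `search_archive` would return a result table whose `access_url` column can then be followed to retrieve the data. Keeping query construction separate from submission means the same query text can be reused against any ObsCore-compliant service.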
When it comes to finding, linking, and downloading data from a diversity of instruments, there is no alternative to the VO, and it is a good solution. There is no need for another one: observatories should strive to make their collections available through the VO.
The ESCAPE ESAP service is the high-level interface where software, data, and processing come together. As such, it was the most logical place for JIVE to address the high-level linking of the individually developed components. The ESCAPE ESAP makes finding and processing data from a variety of instruments or experiments much easier.
VLBI data reduction is non-trivial. JIVE has tried to address common data reduction tasks and the reproducibility of scientific analysis of VLBI data by creating a Jupyter notebook environment containing the CASA (generic) radio-astronomical data processing and analysis package, extending its VLBI functionality (work done in ESCAPE OSSR), and capturing the state-of-the-art basic calibration steps and parameters in a (version-controlled) Jupyter notebook. Using the JIVE platform (JIVE is working on supporting eduGAIN federated logins at this time), users can calibrate their data sets and publish the (version-controlled) notebook. This notebook, together with the OSSR entry, should provide repeatable data analysis.
Another topic to address is the findability of VLBI data: this was addressed with the ESCAPE VO team, and within ESCAPE ESAP a JupyterLab plugin was developed that exploits this new search interface and allows one-click creation of a ready-to-run Jupyter notebook for a particular search result.
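The idea of turning a search result into a ready-to-run notebook can be sketched with nothing more than the standard library, since a Jupyter notebook is a JSON document. The example below is not the plugin's actual implementation; the function name, the observation identifier and the URL are hypothetical, and the calibration cell is a placeholder.

```python
import json


def make_notebook(obs_id: str, access_url: str) -> dict:
    """Return a minimal nbformat-4 notebook pre-filled for one search result."""
    def code(source: str) -> dict:
        return {"cell_type": "code", "execution_count": None,
                "metadata": {}, "outputs": [], "source": source}

    title = {"cell_type": "markdown", "metadata": {},
             "source": f"# Calibration notebook for {obs_id}"}
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            title,
            # Pre-fill the identifiers so the notebook knows which data to fetch.
            code(f"obs_id = {obs_id!r}\naccess_url = {access_url!r}"),
            code("# calibration steps for this observation would go here"),
        ],
    }


# Hypothetical identifiers, for illustration only.
nb = make_notebook("EXAMPLE01", "https://example.org/datalink?ID=EXAMPLE01")
notebook_json = json.dumps(nb, indent=1)
print(notebook_json[:60])
```

Writing `notebook_json` to a file with an `.ipynb` extension yields a notebook that a Jupyter server can open. A real template would carry the version-controlled calibration steps instead of the placeholder cell.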
With funding from ESCAPE it was possible to purchase hardware to configure and run a JupyterHub service. The rest of the time was spent on creating and debugging the Jupyter notebook, developing the search plugin, and enabling federated logins. Regarding the latter: these are JIVE's first steps towards using an (external) federated login service, which required a lot of learning and entering into contracts with the Dutch NREN SURF.
JIVE has *significantly* lowered the barrier to entry for European VLBI Network data reduction by addressing the findability of the data, by showing how to calibrate it, and by providing the compute and storage to execute this on.
The software is only experimental at the moment. As users start to exercise the system, requests for more functionality are expected; for example, a BinderHub installation could be a next target. This seems a promising technology, but it is not funded or budgeted within ESCAPE. Furthermore, it is expected that remaining common issues between RIs could benefit from continuing the developments started in ESCAPE. For this reason, JIV-ERIC is a signatory to the ESCAPE collaboration agreement.
JIVE would also like to work on providing a DOI infrastructure for the published Jupyter notebooks, but that was not covered by the ESCAPE project and will have to wait for future funding.
These are the first steps towards an open science cloud; there is more work to do to iron out obstacles that prevent a truly seamless data finding-and-reduction experience for (EOSC) users.