Strengthen Data Management in Galaxy

This proposal focuses on enhancing Galaxy's data management features to provide additional provenance information and improve the integration of Galaxy into the existing data management ecosystem. We will leverage existing technologies and services in ELIXIR and complement ongoing international projects (ELIXIR-CONVERGE, the COVID-19 Data Portal, EOSC-Life, etc.) while building on national initiatives (German NFDI, ELIXIR Belgium strategy, UK BioFAIR, etc.).

Our goals include making the Galaxy Data Libraries more scalable and further improving the platform's reusability features through metadata enrichment. The Galaxy metadata system will be extended to enable the export of analysis records, together with their provenance, to relevant ELIXIR Core Data Resources and registries (e.g. WorkflowHub).

A strong emphasis will be placed on integrating EGA, FAIRtracks, and the GA4GH Beacon network into Galaxy to support analyses of human data. Because such data are sensitive, support for user-level encrypted data processing will also be added. To this end, we will add an encryption layer to the Pulsar network. We will also enhance performance by increasing the data locality of distributed Galaxy analyses through a prototype data caching network.

These data management features and improvements address concrete, current worldwide needs, such as those related to COVID-19 (meta-)analyses. The Galaxy Community has demonstrated its ability to sustain a fast rollout of novel, fit-for-purpose features for European researchers, a trend we intend to continue with this proposal.

Duration
1st June 2021 - 31st May 2023