The goal of this WP is to provide a toolkit for researchers to enable lifecycle management for their research data, in accordance with international standards. We will identify a suite of recommended tools to locally manage data as well as metadata, which are integrated with the recommended ELIXIR Core Resources and Deposition Databases (Task 1.2), and we will smooth the way to their wide adoption and sustainable co-production.
Mindful that each country or institution may have its own generic e-infrastructure (e.g. facilities for data sharing and computing), as well as other local platforms and constraints, we need to support variations from Node to Node. However, we aim to drive interoperability (e.g. via standardised interfaces and validation tools, like BrAPI for plants) and nurture existing or growing consensus around usage of specific platforms (e.g. DMPonline, FAIRDOM, Galaxy, Jupyter).
We will do this by defining a data management toolkit “blueprint” that consists of a set of interconnected components and their interfaces. Components will include:
- Wizards/templates supporting writing of data management plans,
- Systems supporting capture of metadata at the point of sample collection, data generation, data processing and model building, supporting "FAIR at source" data sharing,
- Support for sharing of data and metadata within projects and with the academic community as well as general public,
- Linked data analysis and workflows supporting provenance.
We have organised this WP with three tasks that identify, assemble, update and provide access to the toolkit (with individual experts from our Nodes) and support the WP demonstrators. In addition, Task 3.4 interfaces with WP2 to develop and deliver training and capacity building on the toolkit and its single-entry access portal.
Objectives
O3.1 | Increase Node interoperability through use of a harmonised common data management toolkit (with WP1) | Task 3.1 |
O3.2 | Establish national capacity in data stewardship in using the toolkit as well as updating, extending and sustaining the toolkits (with WP2) | Task 3.2 and 3.4 |
O3.3 | Through a single entry-point for researchers, institutes, projects as well as funders, we will enable discovery of appropriate components | Task 3.3 |
O3.4 | Demonstrate best practice and exemplar toolkit configurations through demonstrator projects (with WP1 and WP5) | Task 3.2 |
Tasks
Task 3.1 Establish a Starter Toolkit
This task will assemble a “Common Toolkit Blueprint” and example implementation configuration of the toolkit, using already available tools. Work will be based on the emerging consensus on toolkits in use in national Nodes (in collaboration with WP1).
To facilitate integration in the diverse national contexts, emphasis will be placed on identifying and acting on interaction points (interfaces) and the necessary standardisation between tools. Specifically, we will focus on an omics-oriented blueprint, centred around FAIRDOM and Galaxy, and an instance targeting sensitive human data. These starter toolkits, based on existing practices in the Nodes, will allow ELIXIR-CONVERGE to deliver tangible solutions.
Due to national priorities and infrastructure choices, we will provide Node-specific instances of these blueprints, focussing on lightweight, sustainable integration practices (API-based) and adoption of shared templates and markups (e.g. Bioschemas). This will drive interoperability between Nodes as well as facilitate capacity building and accelerated deployment of well-established, proven solutions that are integrated in the ELIXIR ecosystem of Core Data Resources, Deposition Databases, as well as analysis technologies and platforms established in EOSC-Life (e.g. Galaxy and BioContainers).
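As an illustration of the shared markup mentioned above, the following minimal sketch builds a schema.org Dataset description as JSON-LD, the vocabulary that Bioschemas profiles build on. The helper name `dataset_markup`, the example values and the example.org URLs are assumptions made for illustration only; the exact set of required properties should be taken from the relevant Bioschemas profile rather than from this sketch.

```python
import json


def dataset_markup(name, description, url, identifier, keywords, license_url):
    """Return a schema.org Dataset description as a JSON-LD dictionary.

    Hypothetical helper: the property selection is illustrative and is not
    taken from the proposal or from a specific Bioschemas profile version.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
        "keywords": list(keywords),
        "license": license_url,
    }


# Placeholder values only; none of these refer to a real dataset.
example = dataset_markup(
    name="Example omics dataset",
    description="Placeholder description of a dataset managed with the toolkit.",
    url="https://example.org/datasets/1",
    identifier="https://example.org/datasets/1",
    keywords=["omics", "example"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

# Serialise for embedding in a landing page (e.g. in a JSON-LD script element).
print(json.dumps(example, indent=2))
```

Because the markup is plain JSON-LD, a Node can emit it from whatever local platform it runs, which is the kind of lightweight integration the task describes.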
Leadership: University of Manchester (Carole Goble, ELIXIR UK)
Participants: BSC (ELIXIR Spain), CSC (ELIXIR Finland), Heidelberg Institute for Theoretical Studies (ELIXIR Germany), SIB (ELIXIR Switzerland), University of Bergen (ELIXIR Norway), University of Luxembourg (ELIXIR Luxembourg), University of Tartu (ELIXIR Estonia), ÚOCHB (ELIXIR Czech Republic), Uppsala University (ELIXIR Sweden), VIB (ELIXIR Belgium).
Task 3.2 Processes for enriching, maintaining and sustaining the Toolkit
In this task we will define processes to expand and update the toolkit. Based on the selected demonstrators (WP5), we will identify tools available to and/or provided by Nodes, evaluate them for broader usage, and assess their potential to integrate into the toolkit. Selection processes and guidelines will be developed, informed by experiences of selecting Recommended Interoperability Resources and Node data management best practices. This will be done in close collaboration with the Expert Network (WP1).
Emphasis will be placed on the identification of domain-specific tools (e.g. BrAPI) and on guidelines and best practices for tool integration (Task 3.4) to facilitate and stimulate integration in the established ecosystem. This task, in collaboration with Task 5.2, will deliver best practice and exemplar toolkit configurations for investigator-centric data management, integrated in the portal (Task 3.3). The tools that make up the toolkit will be regularly reviewed.
Leadership: DTL-Projects (Celia van Gelder, ELIXIR Netherlands)
Participants: University of Bergen (Inge Jonassen, ELIXIR Norway), ATHENA-RIC (ELIXIR Greece), BSC (ELIXIR Spain), CNRS (ELIXIR France), DTU (ELIXIR Denmark), EMBL-EBI, Heidelberg Institute for Theoretical Studies (ELIXIR Germany), Hungarian Academy of Sciences (ELIXIR Hungary), INESC-ID (ELIXIR Portugal), University of Ljubljana (ELIXIR Slovenia), University of Manchester (ELIXIR UK), University of Tartu (ELIXIR Estonia), ÚOCHB (ELIXIR Czech Republic), Uppsala University (ELIXIR Sweden), VIB (ELIXIR Belgium), Weizmann Institute of Science (ELIXIR Israel).
Task 3.3 Matching tools and users: access portal integration
A toolkit should not be presented as a flat list of things researchers can use, since the researchers who most need a tool are often unaware that a (generic) solution exists. A better approach is to present the toolkit in the form of the Data Stewardship Wizard (https://ds-wizard.org), which is part of the ELIXIR portfolio. The wizard guides the researcher through the development of the data management plan and can refer to all other components of the toolkit where the project and project planning call for it.
Through integrations, building on established registries such as bio.tools (tools and services), FAIRsharing (standards, databases and policies) and TeSS (training materials), we will establish a resource that provides each user with a unique view on this whole ecosystem, allowing all stakeholders not only to discover the components, but also to identify an operational, interlinked trajectory tailored for specific use cases. This approach will promote best practices and the dissemination of the blueprints developed in Task 3.1 and 3.2.
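The idea of a per-user view over the registries can be sketched as a simple filter over catalogue entries. This is an illustrative sketch only: the `Entry` record, the lifecycle stage labels, the tags and all entry names are hypothetical, and no real registry API is called. In practice the entries would come from bio.tools, FAIRsharing and TeSS through their own interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    registry: str    # source registry, e.g. "bio.tools", "FAIRsharing", "TeSS"
    name: str        # hypothetical entry name
    stage: str       # hypothetical data lifecycle stage label
    tags: frozenset  # hypothetical use-case tags


# Hypothetical ordering of lifecycle stages used to lay out the trajectory.
STAGES = ["plan", "collect", "process", "share"]


def trajectory(entries, use_case_tags):
    """Group entries by stage, keeping those that share a tag with the use case."""
    wanted = set(use_case_tags)
    result = {stage: [] for stage in STAGES}
    for entry in entries:
        if entry.stage in result and wanted & entry.tags:
            result[entry.stage].append(entry)
    return result


# Made-up catalogue; the names do not refer to real registry records.
CATALOGUE = [
    Entry("bio.tools", "example-workflow-tool", "process", frozenset({"omics", "plants"})),
    Entry("FAIRsharing", "example-metadata-standard", "collect", frozenset({"plants"})),
    Entry("TeSS", "example-dmp-course", "plan", frozenset({"omics", "human-data"})),
    Entry("bio.tools", "example-secure-archive", "share", frozenset({"human-data"})),
]

view = trajectory(CATALOGUE, {"plants"})
for stage in STAGES:
    print(stage, [entry.name for entry in view[stage]])
```

Grouping by stage rather than by registry is what turns a flat list into a trajectory: a user with a given use case sees which component applies at each step of the data life cycle.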
Leadership: DTL-Projects (Rob Hooft, ELIXIR Netherlands)
Participants: Heidelberg Institute for Theoretical Studies (ELIXIR Germany), SIB (ELIXIR Switzerland), University of Bergen (ELIXIR Norway), ÚOCHB (ELIXIR Czech Republic), Uppsala University (ELIXIR Sweden), VIB (ELIXIR Belgium).
Task 3.4 Best practices and training
To disseminate and build capacity to use and implement the Data Management Toolkit, we will develop best practices and training materials. We will organise interactive training that helps disseminate and advance best practices, creates new stewardship and training capacity, and feeds back into the selection and development processes for the toolkit (Task 3.1 and 3.2). The training will take the form of workshops, hackathons and online events, delivered in close collaboration with the training experts and adhering to the Training co-production model (WP2). We will target the different stakeholders: data experts (WP1), researchers, and support staff of data-generating facilities. These activities will be crucial to instill 'FAIR data at source' throughout the data life cycle, and will also provide trainees with the skills and knowledge to retroactively 'fairify' existing datasets.
The increased stewardship capacity using the portal (Task 3.3) and toolkits, and development capacity for updating, extending and sustaining the toolkits (with WP2), will ensure the long-term sustainability of the developed toolkit and procedures beyond ELIXIR-CONVERGE.
Leadership: University of Luxembourg (Pinar Alper, ELIXIR Luxembourg)
Participants: ATHENA-RIC (ELIXIR Greece), CNR (ELIXIR Italy), CNRS (ELIXIR France), DTL-Projects (ELIXIR Netherlands), DTU (ELIXIR Denmark), EMBL-EBI, Heidelberg Institute for Theoretical Studies (ELIXIR Germany), Hungarian Academy of Sciences (ELIXIR Hungary), INESC-ID (ELIXIR Portugal), SIB (ELIXIR Switzerland), University of Bergen (ELIXIR Norway), University of Ljubljana (ELIXIR Slovenia), University of Tartu (ELIXIR Estonia), ÚOCHB (ELIXIR Czech Republic), VIB (ELIXIR Belgium), Weizmann Institute of Science (ELIXIR Israel).
Deliverables
D3.1 | Assembled starter toolkit | Toolkit blueprints for -omics and sensitive data assembled. | July 2020 |
D3.2 | Toolkit extension based on first wave demonstrators | Apply processes to extend the toolkit addressing additional communities. | July 2021 |
D3.3 | Single entry-point portal for the toolkit for all stakeholders | Components of the toolkit integrated into the portal. | July 2022 |
D3.4 | Dissemination of the toolkit across the European Research Area | Training and capacity building of all stakeholders. | November 2022 |
D3.5 | Extended toolkit | Harmonised toolkit including expansion with second wave of demonstrators. | January 2023 |