Research Data Management guidelines

Resources to help you with data management in life sciences, from general topic introductions to step-by-step instructions for solving specific problems. 

Resources in the ecosystem

  • Get an overview of data management, including advice specific to your domain, role and country
  • Get step-by-step recipes for accomplishing data management tasks
  • Explore the online tool that guides you through data management planning
  • Access life science data through ELIXIR's Core Data Resources
  • Find ELIXIR-recommended deposition databases to store and share experimental data
  • Use interoperability resources to connect data across systems and domains
  • Find training courses, webinars, training materials and workflows
  • Search for standards, databases and data policies by domain, species and country
  • Find bioinformatics software and connect it in workflows

User journeys

How you can use our resources to address common tasks in research data management.

Audience: Data stewards, researchers, trainers, PIs, policy makers, research software engineers.

Starting with Research Data Management (RDM) can look like an overwhelming task, but there are tools and personal expertise to support you. This user journey showcases the basic steps that get you started.

  • Begin with general guidance on data management, including the domain-specific and task-oriented topics available at RDMkit. RDMkit is an online portal providing best practice guidelines for data management, organised along several axes, including task, domain and user role.
  • From the RDMkit, follow the links to step-by-step instructions for your task in the FAIR Cookbook. You can also find extended information and related resources about the standards in FAIRsharing, linked from both the RDMkit and FAIR Cookbook.
  • On each RDMkit page, you can also find a link to TeSS, where you can find relevant training materials and courses. TeSS as well as RDMkit also link to the catalogue of computational tools for life sciences, bio.tools.

Audience: Data stewards, researchers, trainers, PIs, policy makers, research software engineers.

A data management plan (DMP) is used to structure and plan data management activities in your project. It comprises handling of data during and after the project. Many funding agencies have now made it compulsory to provide a data management plan with grant applications.

  • To start, consult the Data Management Plan (DMP) page at the RDMkit for an overview of what a DMP entails. In addition, the national pages on RDMkit list the detailed requirements of certain funders and institutions.
  • From the RDMkit you can choose a tool like the Data Stewardship Wizard (DSW) to create a DMP. The DSW is a questionnaire-style tool for creating, updating, and sharing your DMP. To create the DMP you may want to choose a local DSW from the National resources table on the RDMkit Data Management Plan page or the DSW cloud for ELIXIR. The DSW also cross-links with RDMkit’s guidelines, should you need them when answering questions. In addition, the DSW imports information from FAIRsharing to help you choose among repositories, standards and policies for your project.
  • For further guidance, you can navigate from the RDMkit to TeSS to discover DMP-, DSW- and FAIRsharing-related training materials and events.

Audience: Data stewards, researchers, PIs.

  • The Data Sensitivity page in the RDMkit gives you an overview of what sensitive personal human data is, along with general guidance. Then, go to the Data protection page to find out how to protect sensitive data from an ethical and legal standpoint.
  • At this stage, it is important to review local regulations and policies for your country, which you can read about in the National resources pages in the RDMkit. Please note that most of these are introductory resources and we would always advise seeking assistance from your Data Protection Officer (DPO) or data stewards with particular expertise with human data due to the complexity of the domain.
  • For a more technical perspective, follow the link from the RDMkit Data protection page to the FAIR Cookbook recipe to learn about how to represent data use conditions for sensitive data using controlled vocabularies that can be encoded into a machine-readable metadata format.
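As an illustration of the last step, the sketch below encodes data use conditions as controlled vocabulary terms in a small JSON record. It assumes the Data Use Ontology (DUO) as the vocabulary, which the text above does not name; the term identifiers and labels are quoted from memory and should be checked against the ontology, and the record layout, function name and dataset identifier are hypothetical.

```python
import json

# Assumption: DUO is used as the controlled vocabulary. Verify these
# identifiers and labels against the ontology before relying on them.
DUO_TERMS = {
    "DUO:0000042": "general research use",
    "DUO:0000007": "disease specific research",
}


def data_use_record(dataset_id, duo_codes):
    """Build a machine-readable record of data use conditions.

    The record layout is hypothetical and only meant to show the idea of
    pairing a dataset with vocabulary term identifiers.
    """
    unknown = [code for code in duo_codes if code not in DUO_TERMS]
    if unknown:
        raise ValueError(f"Unknown data use codes: {unknown}")
    return {
        "dataset": dataset_id,
        "dataUseConditions": [
            {"id": code, "label": DUO_TERMS[code]} for code in duo_codes
        ],
    }


# "example-dataset-001" is a made-up identifier used for illustration.
record = data_use_record("example-dataset-001", ["DUO:0000042"])
print(json.dumps(record, indent=2))
```

Storing the term identifier next to its label lets software compare conditions by identifier while people can still read the record.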

Audience: Data stewards, researchers, PIs, research software engineers.

  • The documentation and metadata page in RDMkit will help you to understand what you need to consider when managing the metadata for your project. To further explore the standards used in your domain as well as their connections, a link to FAIRsharing is provided.
  • Several relevant FAIR Cookbook recipes are linked from RDMkit that can help with the task. For example, you can follow instructions to create a data dictionary to define the metadata variables you need to capture - this recipe is accessible to most researchers and does not require a great level of technical knowledge.
  • If you want to go further, check out the recipe on how to use ontologies to describe your variables and values - this recipe requires some understanding of what ontologies are and how to use them. If no existing standard fully meets your needs and you want to go beyond metadata standards and data dictionaries, a further recipe guides you through creating a metadata profile from scratch. The FAIR Cookbook recipes also link you back to RDMkit for broader context.
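To make the idea of a data dictionary more concrete, here is a minimal sketch of one held in code and used to check a metadata record. It is not taken from the FAIR Cookbook recipe; the variable names, allowed values and the validate_record helper are all hypothetical.

```python
# A hypothetical data dictionary: each metadata variable gets a type,
# a description and optional constraints.
DATA_DICTIONARY = {
    "sample_id": {"type": str, "required": True,
                  "description": "Unique identifier of the sample"},
    "age_years": {"type": int, "required": True, "min": 0, "max": 120,
                  "description": "Age of the donor in years"},
    "tissue": {"type": str, "required": False,
               "allowed": {"liver", "kidney", "blood"},
               "description": "Tissue the sample was taken from"},
}


def validate_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems found in one metadata record."""
    problems = []
    for name, rule in dictionary.items():
        if name not in record:
            if rule.get("required"):
                problems.append(f"{name}: missing required variable")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{name}: value not in allowed list")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{name}: above maximum {rule['max']}")
    # Flag variables that were captured but never defined.
    for name in record:
        if name not in dictionary:
            problems.append(f"{name}: not defined in the data dictionary")
    return problems


print(validate_record({"sample_id": "S1", "age_years": 34, "tissue": "liver"}))
print(validate_record({"sample_id": "S2", "age_years": -1, "colour": "red"}))
```

In practice a data dictionary is often kept as a spreadsheet or CSV file; the same checks apply once it is loaded.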

Audience: Data stewards, researchers, and everyone who wants to submit data to the BioImage Archive.

  • Start by checking the BioImage Archive website. Follow their online tutorial about submitting data to the BioImage Archive.
  • If you need additional practical guidance, search the FAIR Cookbook for templates and scripts that can help with submission to the BioImage Archive, such as this example on High-Content Screening data deposition.
  • From the “What to read next?” section in the FAIR Cookbook, you can access the RDMkit Bioimaging data page, which provides general information and solutions for managing and submitting bioimaging data. The RDMkit Tool assembly section might provide insights on tooling for managing bioimaging data, such as the OMERO tool assembly page.
  • Under “More information” in RDMkit, you can follow the link to the registry of training events and materials, TeSS, related to BioImage Archive, such as metadata guidelines for bioimaging data. You can use TeSS to find additional training by searching keywords of your choice (e.g. REMBI).

Audience: Data stewards, researchers, and everyone who wants to submit data to the BioImage Archive.

  • Data management guidance for trainers is available on RDMkit. For general information, guidelines and best practices about DMPs, read the RDMkit Data management plan page. The DSW is one tool that facilitates data management planning through a systematic, questionnaire-based approach. To shape the experience of your training audience, you could set up a dedicated instance of DSW and customise different aspects of it. You might want to consult your local IT administrative team regarding IT security issues.
  • An equally important aspect of creating a DMP is being familiar with institutional guidelines, which you can learn about via the national resources available on RDMkit.
  • You may want to explore how existing course materials address the topic of Data Management Planning, e.g. through hands-on sessions, in the form of a webinar, or as a mix of theoretical and practical content. Following the link from RDMkit you can go to the Training eSupport System (TeSS), a training platform by ELIXIR, which provides a collection of past and recent material used for courses, webinars, and so on. Use the keyword “data management plan”. Further, you may search for “train the trainers” on TeSS to find events and workshop materials that help you improve your training skills.

FAQs

RDMkit covers general considerations on standards for metadata and documentation. You can find additional standards on the RDMkit pages specific to your domain.

Moreover, FAIRsharing provides a curated list of standards for data formats and metadata. In terms of adding to the DMP, Data Stewardship Wizard (DSW) provides integration that allows searching for a standard from FAIRsharing via the DSW interface directly; then, the standard record is linked within the DMP.

There is an RDMkit page on RDM costs that explains the different aspects to take into consideration when estimating these costs. Also, the DSW Storage Costs Evaluator (based on this calculation model) is a more detailed tool for estimating costs for data storage.

  • Check for national funder recommendations in the RDMkit national pages. If your national funder is not mentioned in the national page, contact the page coordinator.
  • You can find further guidance in the Data Management Planning task page.
  • Consult the documentation of your funding agency or institution, or contact them to figure out if they require or recommend a DMP template.
  • A core DMP template has been provided by Science Europe.
  • Read DMP guidelines from the Horizon Europe Programme Guide and the Horizon Europe Annotated Model Grant Agreement. The Horizon Europe DMP template can be downloaded from the Reference Documents page, by clicking on “Templates & forms”, “Project reporting templates” and then on “Data management plan (HE)”.
  • If you provide services on data management planning to other people, you might want to consider adopting the DMP Common Standard model from the Research Data Alliance to produce a machine-actionable DMP template.
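To give a rough idea of what "machine-actionable" means in the last point, the sketch below builds a small DMP as structured data and serialises it to JSON. The field names are recalled from the RDA DMP Common Standard and are assumptions that should be validated against the official schema; the build_madmp helper, identifiers, names and e-mail address are hypothetical.

```python
import json
from datetime import datetime, timezone


def build_madmp(title, contact_name, contact_email, dataset_titles):
    """Build a minimal DMP as a nested dictionary.

    Field names are assumptions modelled on the RDA DMP Common Standard;
    check them against the published schema before use.
    """
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return {
        "dmp": {
            "title": title,
            "created": now,
            "modified": now,
            "language": "eng",
            "ethical_issues_exist": "unknown",
            # Placeholder identifier, not a real DMP.
            "dmp_id": {"identifier": "https://example.org/dmp/1", "type": "url"},
            "contact": {
                "name": contact_name,
                "mbox": contact_email,
                "contact_id": {"identifier": contact_email, "type": "other"},
            },
            "dataset": [
                {
                    "title": dataset_title,
                    "dataset_id": {"identifier": f"local-dataset-{i}",
                                   "type": "other"},
                    "personal_data": "unknown",
                    "sensitive_data": "unknown",
                }
                for i, dataset_title in enumerate(dataset_titles, start=1)
            ],
        }
    }


plan = build_madmp(
    "Example project DMP",
    "Jane Doe",
    "jane.doe@example.org",
    ["Raw sequencing reads", "Processed expression matrix"],
)
print(json.dumps(plan, indent=2))
```

Because the plan is structured data rather than free text, other services can read individual answers, such as the list of datasets, without parsing a document.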

For general guidance on data preservation, take a look at the Data Preservation page on RDMkit. Data preservation strategies should be covered in your project’s Data Management Plan.

Your institution, funder or country may also have additional rules you need to follow. If there is a Data Stewardship Wizard instance available in your country or institution, this may be a good source of information.

Use the EBI Data submissions wizard to find the repository that is usually used for your type of data, and search directly there.

The RDMkit Existing Data page contains more considerations and solutions for finding datasets and databases, such as registries, indices and search engines. If you are searching for datasets linked to specific articles, Europe PMC provides search functions for this.

FAIRsharing.org is a registry of standards and databases which can help you to find relevant repositories.

Before sharing any data, consider the type of data you are sharing and the legal and ethical implications. There are guidelines available in the RDMkit (Data Lifecycle > Sharing) providing recommendations for different levels of sharing. The FAIR Cookbook has several recipes covering the practicalities of data sharing, such as sharing via SFTP.

Removing personal data does not guarantee that your dataset is truly anonymised, as discussed on the data sensitivity page in the RDMkit. The RDMkit has guidance on how to protect data under GDPR. In addition to consulting these resources, talk to your node’s or institution’s data stewardship team for guidance that takes into account any national legislation you need to be aware of.

Some types of non-human data, such as biodiversity data, may still qualify as “sensitive”. Take a look at the RDMkit page on data sensitivity for more information.

Do not be tempted to attach data as supplementary materials - deposit the data in a repository that is as open as possible and as closed as necessary! RDMkit provides guidance on how to pick the most appropriate repository for your domain.

FAIRsharing helps you search the databases in your domain and the FAIRsharing Assistant can guide you through this process. You can also search FAIRsharing for the data policy of the journal you are targeting to see what requirements they have.

RDMkit provides lightweight guidance on assessing the wider context of data management capabilities. For specific tools related to data assessments, start with the FAIR Evaluator tool (Recipe 1 in the FAIR Cookbook) or the indicators on the FAIR Data Maturity website to locate yourself on the FAIR maturity spectrum.

Then, follow additional recipes from the FAIR Cookbook to move up the maturity ladder. For assessing software, take a look at FAIR Evaluator and the Software Management Wizard.

Additional resources can be found on FAIRsharing.org or FAIRassist. bio.tools and OpenEBench/tools also list several tools and services dealing with FAIR evaluation of resources.

Do I have to create something of my own (like a dashboard where data is queryable using an SQL database) or is there something ready to use?

Submit Data to a Public repository

Metadata

Add appropriate and adequate metadata to your dataset as advised in the RDMkit Documentation and Metadata page.

ELIXIR FAIR assessment resources

Based on the nature of your data (e.g. bioinformatic tools, datasets, etc.) and execution type (i.e. manual or automatic), find your FAIRness assessment service within FAIRassist's extensive list. Some recommendations are:

Training resources on the introductions to FAIR principles:

Every field has its own specific metadata needs. To get an idea of your field-specific requirements, you can consult RDMkit and select “Your domain”. The “Your domain” section gives you an overview of where you can deposit your data, how your metadata should be structured, which tools can be used for this, and so on.

  • Repository specific: go to the database you want to deposit your data in and check their metadata requirements. Even within the same field, there might be small differences.
  • There is a study that compared all available metadata schemes.
  • FAIRsharing provides a registry for databases and the respective connected minimal metadata standards.

It is a good start, but not enough. You can find ways to improve this by reading the following pages on RDMkit:

There is no specific software for documenting data. RDMkit provides some suggestions in its guidance on capturing/documenting the data. The page also discusses adding appropriate metadata.

You don’t have to code to use ontologies. Check whether there are recommended ontologies that you should use, and consider your purpose in using them, as described in RDMkit. One way to identify which ontology to use is to check the domain-specific standards in FAIRsharing, for example by using the FAIRsharing Assistant.

The RDMkit discusses the different types of persistent identifiers.

For software, the easiest solution is to generate a persistent identifier for your GitHub repository through Zenodo. For web resources, you can review the FAIR Cookbook recipe for creating resolvable identifiers.
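Once a DOI has been issued, it is usually cited in its resolvable form. The sketch below normalises a DOI string into a https://doi.org/ URL. The doi_to_url helper and the DOI value are hypothetical and used for illustration only, and the pattern is a simplified check rather than a full DOI syntax validator.

```python
import re

# Simplified pattern: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Turn a DOI in any common written form into a resolvable URL."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Made-up DOI using the 10.5281 prefix under which Zenodo issues DOIs.
print(doi_to_url("doi:10.5281/zenodo.0000000"))
```

Citing the URL form means readers and software can follow the identifier directly, even if the repository behind it moves.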

The answer to this question is highly dependent on the exact DMP requirements of the project and/or funder. The RDMkit Planning page provides some further guidance.

If you are looking for guidance on writing a Data Management Plan (DMP) for software, you are probably asking about software management planning (SMP). ELIXIR Europe provides a range of resources that can help:

  • Software Best Practices Group (4OSS): ELIXIR's Software Best Practices Group, also known as 4OSS (Open Source Software), is a community-driven initiative aimed at promoting and advancing best practices in software development across the life sciences. The group offers guidance, resources, and tools for developing high-quality software, which can be incorporated into your DMP to ensure the software's long-term sustainability and usability.
  • ELIXIR Software Management Plan (SMP): The ELIXIR SMP is a comprehensive guide specifically designed to assist researchers in managing their software projects effectively. It provides detailed information on various aspects of software development, including version control, documentation, licensing, and sustainability. The SMP can serve as a valuable resource when crafting your DMP for software.
  • Software Management Wizard: It is an interactive online tool designed to help researchers navigate the process of managing software projects. It provides step-by-step guidance and generates customized recommendations based on your specific project requirements. The Software Management Wizard can assist you in identifying the key elements to include in your DMP for software and ensure compliance with best practices.