ELIXIR support for biodiversity research

ELIXIR Nodes provide services that you can use to support biodiversity research. These include: analysing, annotating and archiving DNA sequence and other molecular data; finding and annotating biodiversity-relevant data; linking and integrating data sets; and many others. The page below provides further information and will be updated frequently. If you have any questions, please contact Physilia Chua (physilia.chua [at] elixir-europe.org).

Archive DNA data in the correct repository

Sequence-based approaches to address biodiversity questions are now widely used and diverse. They can range from the generation of full annotated genomes to short sequence markers. These data can be used to address questions around taxonomies, species occurrence, dietary make-up, species abundance, and many others.

As an open data infrastructure, we encourage you to deposit all raw and consensus DNA sequence data in the European Nucleotide Archive (ENA). If you are not sure how to use the ENA then take the free online course. It takes about 30 minutes to complete.

Wider assistance and guidance on data management can also be found in the ELIXIR RDM Kit, an online guide containing good data management practices applicable to research projects from beginning to end.

Access and retrieve data relevant to biodiversity

ELIXIR provides a range of data resources that allow scientists to access and retrieve biodiversity-relevant data of various types:

  • European Nucleotide Archive (ENA): the ENA provides a comprehensive record of the world’s nucleotide sequencing information. It covers raw sequencing data, sequence assembly information and functional annotation. For instance, search by species name to find all sequences for a particular organism. To learn more about how to use the ENA, take their free online course.
  • UniProtKB/Swiss-Prot: search UniProtKB/Swiss-Prot to find protein sequences for a particular organism. 
  • MGnify: search MGnify using a range of environmental and biodiversity-related variables to retrieve relevant taxonomic, occurrence and abundance measures for a very wide variety of microbial species.
  • EuropePMC: use EuropePMC to search and annotate relevant literature by organism name or other biodiversity-relevant search terms. The search results include full-text access to publications where available.
  • SILVA: use SILVA to find and retrieve curated ribosomal RNA sequence data based on the dataset of millions of ribosomal sequences. The data includes genus-level taxonomic classification. 
  • ITSoneDB: use ITSoneDB to find and retrieve curated ribosomal RNA ITS-1 sequence data with species identity, based on the carefully assembled dataset of millions of ITS-1 sequences.
  • Marine Metagenomics Portal: search the Marine Metagenomics Portal (MMP) to identify richly annotated and manually curated contextual data (metadata) and sequences in connection with the biodiversity of the marine environment.
  • Ocean Gene Atlas (OGA): search the OGA to discover sequencing-derived species distribution and abundance maps for marine organisms. 
  • BacDive: search BacDive to find manually curated knowledge about bacterial and archaeal biodiversity, including taxonomy, physiology, morphology, molecular biology and isolation sources.
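
Several of the resources above can also be queried programmatically. As a minimal sketch of the "find all sequences for a particular organism" use case, the Python snippet below builds a search URL for the ENA Portal API. The endpoint, the parameter names and the tax_tree() query function reflect our understanding of that API and should be checked against the current ENA documentation. The function name, default fields and example taxon ID are illustrative assumptions. Note that this sketch searches by NCBI Taxonomy ID rather than by species name.

```python
from urllib.parse import urlencode

# Assumed ENA Portal API search endpoint; verify against the ENA documentation.
ENA_SEARCH_URL = "https://www.ebi.ac.uk/ena/portal/api/search"


def build_ena_taxon_query(taxon_id, result="read_run",
                          fields=("run_accession", "scientific_name"),
                          limit=10):
    """Return a search URL for records at or below the given taxon.

    taxon_id is an NCBI Taxonomy ID. The helper name and the default
    field list are hypothetical choices made for this example.
    """
    params = {
        "result": result,                   # type of record to search
        "query": f"tax_tree({taxon_id})",   # the taxon and its descendants
        "fields": ",".join(fields),         # columns to return
        "format": "json",
        "limit": limit,                     # keep the response small
    }
    # urlencode percent-encodes the parentheses and commas in the values.
    return f"{ENA_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # 7460 is only an example taxon ID; substitute your own.
    print(build_ena_taxon_query(7460))
```

The resulting URL can be opened in a browser or fetched with any HTTP client. Building the URL separately from fetching it makes the query easy to inspect before any data is downloaded.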

Make your data easier to find and share (FAIR)

ELIXIR has an extensive set of services that can help in making biodiversity data sets more FAIR, including:

  • ELIXIR Recommended Interoperability Resources (RIRs): these help to make your biodiversity data Findable, Accessible, Interoperable and Reusable (FAIR).
  • FAIRsharing: find data and metadata standards, together with the related databases and data policies, relevant to biodiversity.
  • Bioschemas: add Bioschemas markup, which builds on schema.org, to your websites so that search engines and other services can index them, making your data more findable.
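
To illustrate what schema.org markup for a data set can look like, the sketch below generates a JSON-LD description of type Dataset and wraps it in a script tag for embedding in a web page. The helper names and all example values are hypothetical. The properties shown are a small subset of schema.org, so consult the relevant Bioschemas profile for the properties it actually requires or recommends.

```python
import json


def dataset_jsonld(name, description, url, keywords, license_url):
    """Build schema.org Dataset markup and return it as a JSON-LD string."""
    markup = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
        "license": license_url,
    }
    return json.dumps(markup, indent=2)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script element used to embed it in HTML."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    jsonld = dataset_jsonld(
        name="Example soil metabarcoding data set",
        description="Placeholder description of an example data set.",
        url="https://example.org/datasets/soil-metabarcoding",
        keywords=["biodiversity", "metabarcoding"],
        license_url="https://creativecommons.org/licenses/by/4.0/",
    )
    print(as_script_tag(jsonld))
```

Pasting the printed script element into the landing page for a data set is usually enough for crawlers that understand JSON-LD to pick up the description.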

Find software and workflows to analyse your data

ELIXIR provides a diverse range of services that can assist in the analysis of biodiversity data:

  • bio.tools: find biodiversity-specific software and analysis tools via bio.tools, the ELIXIR tools registry.
  • Galaxy: browse to find workflows that enable biodiversity-relevant data processing and analysis including genome assembly, genome comparison and many other techniques.
  • MGnify: access MGnify for an automated pipeline for the analysis and archiving of microbiome metagenomic data. This can help determine the taxonomic diversity and the functional and metabolic potential of environmental samples.

Find computing resources to help you analyse datasets

ELIXIR Nodes run computing services that can be accessed by research projects. Many additional computing resources have been made available to support a range of research projects, and a number offer access to Docker orchestrators including Mesos, as well as OpenStack, Kubernetes/OKD and potentially GPUs where needed. For assistance, please contact jonathan.tedds [at] elixir-europe.org, ELIXIR’s Compute Platform Coordinator. Specific examples of compute resources include:

  • de.NBI cloud (ELIXIR Germany) provides access for projects relating to biodiversity.
  • CSC (ELIXIR Finland) cloud services.
  • e-INFRA CZ (ELIXIR Czech Republic) offers supercomputer resources, storage services and distributed compute resources.
  • The European Galaxy server is an open, web-based platform for data-intensive research. It provides access to compute and storage resources, more than 2,500 different scientific tools, training materials and workflows to guide users.
  • EMBASSY Cloud resources are contributed by EMBL-EBI, as detailed on the European Open Science Cloud (EOSC) Marketplace.
  • ExPASy SIB Portal from SIB (ELIXIR Switzerland) provides a ready-to-use Slurm workload manager with a scientific software stack.
  • High-performance compute and cloud resources provided by IFB (ELIXIR France) include a federated set of national and regional servers.

Find training materials to help you get started

Use the ELIXIR Training Portal, TeSS, to find training courses and materials for hundreds of bioinformatics tools and services.

Contribute to ELIXIR’s Biodiversity work

Join the ELIXIR Biodiversity Community.

Additional key Services, not part of the ELIXIR Infrastructure

The work of the ELIXIR Biodiversity Focus Group has identified a number of key services that are highlighted below, which are not currently part of the ELIXIR infrastructure, but which are highly relevant:

COPO - COPO is a data brokering service to help describe, store and retrieve genomic data more easily, using community standards and public repositories. For instance, DNA sequence data can be deposited in the ENA more easily using the standards and processes set out by COPO.

GlobalFungi - a global repository of fungal metagenomic data obtained by next-generation sequencing, shared through a web-based interface that allows various queries of the database and visualization of the results. The database covers data from all terrestrial habitats except those subject to experimental manipulation, and contains information on fungal communities from soil, litter, dead plant material, living plant tissues and others.

TreatmentBank - search the Plazi resource TreatmentBank using full-text search, taxonomic names, bibliographic records or observation records to retrieve rich data about described species.

Biodiversity Literature Repository - a community in Zenodo providing FAIR data liberated from taxonomic publications, that is, taxonomic treatments, figures and annotated deposits of publications. It is the repository of the TreatmentBank service.

SIBiLS - triage the literature with SIBiLS, which has pre-annotated the literature (MEDLINE, PMC, TreatmentBank, Allen AI pre-prints, ...) with a broad range of ontologies (taxonomic names, biotic interactions, ...). SIBiLS is a back-office curation-support service, and several of its annotations are mirrored into EuropePMC.

Other European-based biodiversity-relevant infrastructures

ELIXIR is part of a much wider network of infrastructures dedicated to biodiversity, with which ELIXIR collaborates.

  • DiSSCo: The Distributed System of Scientific Collections is a new Research Infrastructure (RI) for natural science collections. The DiSSCo RI works towards digitally unifying all European natural science assets under common curation, access, policies and practices, and aims to ensure that the data is easily Findable, Accessible, Interoperable and Reusable (FAIR).
  • EMBRC-ERIC: EMBRC is a pan-European Research Infrastructure for marine biology and ecology research. With its services, it aims to answer fundamental questions regarding the health of oceanic ecosystems.
  • LifeWatch-ERIC: LifeWatch ERIC seeks to understand the complex interactions between species and the environment, taking advantage of High-Performance, Grid and Big Data computing systems, and the development of advanced modelling tools to implement management measures aimed at preserving life on Earth.
  • Catalogue of Life: Catalogue of Life aims to collate the names of all species, set in the context of a taxonomic hierarchy and of their distribution.
  • CETAF: CETAF is the Consortium of European Taxonomic Facilities, a European network of Natural Science Museums, Natural History Museums, Botanical Gardens and Biodiversity Research Centres with their associated biological collections and research expertise.
  • GBIF: The Global Biodiversity Information Facility is an international network and research infrastructure funded by the world's governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.
  • MIRRI-ERIC: The pan-European distributed Research Infrastructure for the preservation, systematic investigation, provision and valorisation of microbial resources and biodiversity. It offers its users a single point of access to the broadest range of high-quality microorganisms, their derivatives, associated data and services.
  • OBIS: The Ocean Biodiversity Information System is a comprehensive gateway to the world’s ocean biodiversity and biogeographic data and information required to address pressing coastal and world ocean concerns.