Over the coming decade, Europe will face critical challenges in maintaining biodiversity, ensuring food security and combating pathogens. Our 2024–28 Programme will address these issues by mobilising and integrating molecular data, using successful coordination models from human genomics. Through strategic investments and collaboration in externally funded projects, ELIXIR will enhance scientific services and support transnational research in these essential areas.
The following projects have been selected as part of the ELIXIR 2024–28 Programme’s Biodiversity, food security and pathogens Science Tier:
- E-PAN: Enhancing pan-genome analysis in plants
- FAIRyMAGs: Optimising Metagenomics Assembled Genomes building: workflow finalisation, training material development, real data evaluation and resource allocation tool creation
- HARVEST: Handling and alignment of plant research FAIRification – value through the use of ELIXIR data Standards and Tools
- Odyssey: Connecting molecular and geographical biodiversity data
With the declining cost of genome sequencing, the focus of plant researchers is shifting towards characterising the wide genomic diversity present within a species. Crop pan-genomes are built by sequencing, comparing and integrating multiple genomes from the same agriculturally important species, such as wheat, rice or potato. Exploiting the information encoded within these pan-genomes can lead to the development of new cultivars that are more resilient to upcoming challenges such as increased drought and heat stress.
Multiple consortia are independently generating and integrating these pan-genomes, but there is currently little progress in streamlining and homogenising these efforts. Sequence quality is no longer a major issue, but the completeness of both the assembly and the subsequent gene annotation is much harder to quantify correctly, even though both are major drivers in explaining the adaptive differences between genotypes. Where there are efforts to visualise and browse pan-genomes, for example by using graph representations, easy retrieval of gene presence/absence variation (PAV) information or structural rearrangements is currently lacking, which hampers knowledge discovery.
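To illustrate the kind of presence/absence query that is currently hard to run against pan-genome browsers, the following is a minimal sketch only. The gene and accession names are hypothetical, and the dictionary layout is an assumption made for illustration, not a format used by E-PAN or any particular pan-genome resource.

```python
# Hypothetical presence/absence calls: gene -> set of accessions that carry it.
# Names and layout are illustrative assumptions, not a real pan-genome format.
pav = {
    "geneA": {"acc1", "acc2", "acc3"},
    "geneB": {"acc1", "acc3"},
    "geneC": {"acc2"},
}
accessions = {"acc1", "acc2", "acc3"}


def split_by_presence(pav, accessions):
    """Return (genes present in every accession, genes missing from at least one)."""
    shared = sorted(g for g, present in pav.items() if present >= accessions)
    variable = sorted(g for g, present in pav.items() if not present >= accessions)
    return shared, variable


def absent_from(pav, accessions, gene):
    """Return the accessions in which the given gene was not called as present."""
    return sorted(accessions - pav[gene])


shared, variable = split_by_presence(pav, accessions)
missing_geneB = absent_from(pav, accessions, "geneB")
```

In this toy data, `shared` holds the genes found in all three accessions, `variable` holds the genes showing presence/absence variation, and `missing_geneB` lists the accessions that lack `geneB`.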
E-PAN aims to streamline the efforts of different research groups within the ELIXIR Plant Science Community. This encompasses the development of effective standards, computational pipelines and tutorials to assess the quality of pan-genomes and provide solutions to identified problems. We will also evaluate and integrate different approaches for data visualisation and browsing, which will be used by different partners sharing pan-genomics results. A one-day meeting and an online workshop will be organised to disseminate results and initiate new collaborative projects. These concerted efforts will lead to a standardised approach for future pan-genome projects, a reduction in duplicated effort across consortia, and a set of tools to visualise and mine pan-genomics results.
Nodes involved: ELIXIR Belgium, ELIXIR Germany, ELIXIR Portugal, ELIXIR Slovenia, ELIXIR UK
Communities: Plant sciences
Metagenomics Assembled Genomes (MAGs) are crucial for understanding biodiversity, enhancing food security and combating pathogens by providing insight into uncultured and unexplored genomes. This proposal outlines a comprehensive project aimed at advancing metagenomics research through the development, optimisation, evaluation and dissemination of robust FAIR workflows for building MAGs.
Leveraging the Galaxy platform, our primary objectives include finalising a user-friendly state-of-the-art Galaxy workflow tailored for MAG construction, and ensuring its accessibility and reusability through integration with WorkflowHub. To support user adoption and proficiency, we will create FAIR educational materials hosted on the Galaxy Training Network (GTN), empowering researchers with the skills necessary to use the workflow effectively.
The efficacy of the developed workflow will be rigorously evaluated by analysing MAGs generated from simulated and real-world data spanning diverse environments: atmosphere, marine and cow gut microbiomes. This evaluation will provide valuable insights into the workflow's performance and its applicability across different sample types, complexities and ecosystems.
We will also investigate the computational resources required to execute the assembly step of the workflow on various input datasets, using data provided by several Galaxy servers and the MGnify team. The aim is to optimise resource allocation to ensure efficient and cost-effective MAG construction. A novel tool will be developed to facilitate this process, allowing researchers to accurately estimate and allocate resources for each step of the assembly pipeline.
By addressing these objectives, our project aims to accelerate metagenomics research by providing researchers with a comprehensive and accessible framework for MAG construction. This framework will not only streamline the workflow for building MAGs but also facilitate reproducibility, collaboration and innovation within the ELIXIR Microbiome Community.
Nodes involved: ELIXIR France, ELIXIR Germany, ELIXIR Italy, EMBL-EBI
Communities: Galaxy, Microbiome
The standardisation and accessibility of plant data are a major challenge for agricultural research. MIAPPE, which was developed as part of the transPLANT and ELIXIR-EXCELERATE projects, has made a decisive contribution to unifying data capture. In addition, the FONDUE Implementation Study facilitated the integration of phenotypic and genotypic data.
Nevertheless, challenges persist in achieving full FAIRness of plant data. The development of guidelines and best practice documents within the Commissioned Service INCREASING has improved the situation. However, further enhancements are required, such as providing additional documentation and reference datasets.
To address these needs, it is important to assess the practical effort required to FAIRify datasets using MIAPPE, ISA, ARC and RO-Crate standards. The idea is to provide biologist-friendly data documentation and at the same time introduce machine-actionable formats for bioinformaticians to use. A further challenge arises from the scattered nature of the information, as there is no single resource on which all the information is collated.
In HARVEST, we aim to address these challenges by FAIRifying datasets (DROPS, AGENT) using the latest version of MIAPPE as a basis, which now covers more diverse and complex use cases. This process will include enriching the MIAPPE documentation, in particular with example datasets, updating training material and refining mappings to other interoperable formats such as BrAPI, Bioschemas and ISA-Tab/JSON. We will also establish links using FAIDARE to repositories such as EMBL-EBI EVA, e!DAL-PGP, recherche.data.gouv and Zenodo, to enhance data sharing and reuse opportunities. An extension of the RDMkit Plant Sciences pages will be implemented to serve as a primary hub for information on FAIRification of plant data. Furthermore, we will consolidate resources and improve accessibility by linking directly to the original web resources and recipes, and add Jupyter notebooks to the FAIR Cookbook where possible.
Nodes involved: ELIXIR Germany, ELIXIR France, ELIXIR Netherlands, ELIXIR UK, EMBL-EBI
Communities: Plant Sciences
Understanding molecular biodiversity is essential for ecological conservation and sustainable development. While a vast array of molecular data awaits exploration, its lack of connectivity with other sources of data and metadata, such as geographical reference, habitat, population size and phenotypic data, often poses a significant barrier to biodiversity research.
This project will develop Odyssey, a web portal with a user-friendly interface that will allow researchers, educators and citizens to navigate the world of molecular biodiversity. Greece and Norway will serve as case studies: two countries with a characteristic and unique wealth of biodiversity, representative of Mediterranean and Nordic ecosystems respectively.
Based on existing sources of information and prototype applications available for specific regions and taxa, this project aims to link current efforts and develop a new interface offering diverse functionalities for data exploration and analysis, such as descriptive statistics, graphs, maps, customisable data filters and dynamic visualisations. Through modular design, the application will ensure flexibility and scalability, enabling easy integration of new data sets and analytical tools in the future. This approach will be used for training and communication, inviting traditional biodiversity research groups to use new information on the spatial patterns of biodiversity and their connection with features that are important for designing conservation measures, such as habitat connectivity, representativity, population demographics, and the dynamics of adaptation and migration.
Odyssey will be a valuable tool for studying the rich molecular biodiversity of Greece and Norway and, ultimately, will offer a basis for managing and conserving it. It will also support the activities of the ELIXIR Biodiversity Community in the two Nodes and in Europe. This will promote collaboration, innovation and knowledge exchange in biodiversity research and beyond.
This new tool will be developed and offered under an open-source licence, encouraging community participation and contribution to further enhance its capabilities and broaden its applications, fostering a robust network for biodiversity research in Greece and Norway.
Nodes involved: ELIXIR Greece, ELIXIR Norway
Communities: Biodiversity