Together with DNAstack, Terra and Seven Bridges, ELIXIR has co-released a series of software demos that showcase real-world computational interoperability across the international genomics community. Large-scale genomic sequencing initiatives around the globe are beginning to generate tens of millions of genome sequences for research and healthcare purposes. These data hold great promise for the future of human health and medicine, but only if they can be responsibly shared and accessed across traditional boundaries.
ELIXIR is helping to enable this global ecosystem of genomics data and services by participating in the Global Alliance for Genomics and Health (GA4GH), the international standards body for genomics. The co-release of the 2020 GA4GH Connection Demos represents the first time this community has shown real-world interoperability using standards implemented at disparate institutions to search for, access, and analyse genomics data from around the world.
‘Working on this demonstrator has been a fascinating and rewarding challenge, as it required us to bundle our resources to assemble the various GA4GH Cloud API implementations. It has been a hard job to combine our work for the last two years into a single, coordinated and highly interoperable service stack. Doing this in the knowledge that other ecosystems in the GA4GH universe were doing the same, at the same time, was extremely motivating’, said Alexander Kanitz, of ELIXIR Switzerland. He is co-leading, together with Jonathan Tedds (ELIXIR Compute Platform Coordinator) and Shubham Kapoor (ELIXIR Finland), the ELIXIR Cloud & AAI ecosystem, which spearheads the implementation of a GA4GH federated ELIXIR Cloud infrastructure.
The three 2020 Connection Demos
- Horizontal Connection Demo: To emphasise the progress of GA4GH in the real world, this demo shows the reproducibility and portability of an analysis across multiple implementers. A GWAS analysis of 1000 Genomes data is performed in multiple systems implementing GA4GH APIs, including DNAstack, Terra (Broad Institute/Verily), ELIXIR, and Seven Bridges.
- Vertical Connection Demo: Originally rolled out in 2019, this is a demonstration of multiple GA4GH standards working together in a single workflow within one institution. The demonstration includes implementations of the Workflow Execution Service (WES), Data Repository Service (DRS), Passports, and Search, all implemented at DNAstack using both the Google Cloud Platform (GCP) and Amazon Web Services (AWS). In 2020, the team improved DRS support in the demo, updated to the newest draft of the GA4GH Search specification, and added multi-cloud support in the form of controlled access workflow inputs across GCP and AWS.
- Cross-Platform Connection Demo: Finally, driven by researcher need in pediatric cancer and other diseases, the FASP team has begun work on example scripts that explore how a researcher might orchestrate GA4GH components provided by many different institutions to aggregate data for analysis. The scripts use implementations of multiple GA4GH standards and data from multiple GA4GH Driver Projects and organisations, including systems from the National Cancer Institute, the National Heart, Lung and Blood Institute, the European Bioinformatics Institute, and the National Center for Bioinformatics. Additionally, this initiative provides a social and technical framework for engaging additional data and tool providers around the globe in 2021.
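To give a flavour of what orchestrating these standards can involve, the following is a minimal illustrative sketch and not code from any of the demos. It assumes the hostname-based DRS URI convention and the run-request field names of the WES specification; the host names, object IDs, workflow URL and the `gwas.inputs` parameter are hypothetical, and no network calls are made.

```python
import json
from urllib.parse import urlparse


def drs_uri_to_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to the HTTPS URL of the object's metadata."""
    parsed = urlparse(drs_uri)
    object_id = parsed.path.lstrip("/")
    if parsed.scheme != "drs" or not parsed.netloc or not object_id:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"


def build_wes_run_request(workflow_url: str, input_uris: list) -> dict:
    """Assemble the form fields for a WES run submission (POST .../ga4gh/wes/v1/runs)."""
    return {
        "workflow_url": workflow_url,
        "workflow_type": "WDL",
        "workflow_type_version": "1.0",
        # WES expects workflow parameters as a JSON-encoded string.
        # "gwas.inputs" is a hypothetical parameter name for this sketch.
        "workflow_params": json.dumps({"gwas.inputs": input_uris}),
    }


# Hypothetical inputs hosted by two different institutions.
inputs = [
    "drs://drs.example.org/abc123",
    "drs://data.example.net/def456",
]
metadata_urls = [drs_uri_to_url(uri) for uri in inputs]
run_request = build_wes_run_request("https://example.org/workflows/gwas.wdl", inputs)
```

In a real setting, a script would send `run_request` to a WES endpoint together with an access token (for example, one obtained through GA4GH Passports), and the workflow engine would resolve the DRS URIs itself.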
During preparation for the demo, the ELIXIR Cloud & AAI ecosystem ran a workflow utilising all four GA4GH Cloud APIs, including the yet-to-be-approved Task Execution Service (TES), making it the first GA4GH-powered stack to do so.
‘TES plays a key role in our plans to achieve a federated ELIXIR Cloud, bringing us closer to the vision of bringing analysis to the data, on a per-task basis’, said Alexander Kanitz.
‘The Connection Demos are an enormous success for the members of the GA4GH Work Streams, who have collectively dedicated thousands of hours over the last three years toward standards development’, said Ewan Birney, Deputy Director-General of the European Molecular Biology Laboratory (EMBL), Director of EMBL’s European Bioinformatics Institute (EMBL-EBI), and Chair of GA4GH. ‘The demos show how this community’s work will enable interoperability across the genomics endeavour.’