Galaxy-ELIXIR webinar series: FAIR data and Open Infrastructures to tackle the COVID-19 pandemic

Thu 30 April 2020, 16:00 to Thu 28 May 2020, 17:00

The Galaxy Community and ELIXIR are organising a webinar series to demonstrate how open software and public research infrastructures can be used to analyse and publish SARS-CoV-2 data.

In a series of five webinar sessions, experts from ELIXIR and the Galaxy community in the US and Europe will demonstrate how open access and open science are fundamental to a fast and efficient response to public health crises. The focus will be on research reproducibility and transparency, using exclusively open source tools and the Galaxy platform.

The goal of the series is to demonstrate publicly accessible infrastructure and workflows for SARS-CoV-2 data analysis. The webinar sessions will guide participants step by step through setting up and executing the SARS-CoV-2 data analysis workflows developed by the global Galaxy community. After completing the series, participants will be able to fully reproduce the workflows and conduct their own analyses of SARS-CoV-2 data.

The webinar series starts on 30 April 2020 with an introductory session. Subsequent sessions take place at weekly intervals.

More information about Galaxy analyses of COVID-19 data:

Programme - upcoming sessions

Session 5: Behind the scenes: Global Open Infrastructures at work 

28 May 2020, 17.30-18.30 CEST (starts at 16.30 BST, 11.30 EDT, 8.30 PDT)

This session will show participants how to use Galaxy's compute capacity to run their own analyses. It will present the Pulsar network, which connects data centres and High Performance Computing clusters so that they can share their computing power in support of Galaxy Europe users, and it will give examples of how to submit an analysis job from the user's perspective.


  • Gianmauro Cuccuru, University of Freiburg, Germany, member of the European Galaxy team
  • Marco Antonio Tangaro, The Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
  • Simon Gladman, University of Melbourne, Australia
  • Nate Coraor, Pennsylvania State University, USA

Past sessions

Session 1: Introduction to Galaxy and the Galaxy workflows for SARS-CoV-2 data analysis

30 April 2020, 17.00-18.00 CEST (starts at 16.00 BST, 11.00 EDT, 8.00 PDT)

The first session introduced the Galaxy platform and the other public research infrastructure used throughout the webinar series. It also explained the motivation behind the Galaxy COVID-19 projects and the benefits of open, reproducible research and transparent, interoperable analytics.


Session 2: Genomics/Variant Calling

7 May 2020, 17.00-18.00 CEST (starts at 16.00 BST, 11.00 EDT, 8.00 PDT)

The second session presented the initial analysis of the SARS-CoV-2 genome, published on bioRxiv. It guided participants through accessing and collecting the available datasets, the genome assembly, and the analysis of within-sample sequence variants. It also explained how to deploy on a Galaxy instance all the tools and workflows needed to reproduce the analysis.


Session 3: Cheminformatics: Screening of the main protease

14 May 2020, 17.00-18.00 CEST (starts at 16.00 BST, 11.00 EDT, 8.00 PDT)

This session presented the Galaxy workflow to identify candidate molecules for COVID-19 drug treatment, using molecular docking simulation of the SARS-CoV-2 main protease. These simulations are used to predict the binding positions of the candidate molecules in the protease binding site, score the quality of each pose, and compare the results with experimental crystallographic data.

The computationally intensive workflow was executed through a distributed compute network available via the Galaxy Europe platform. The webinar presented methods and workflows for identifying potential COVID-19 drug candidates, with special emphasis on the complex methods that were applied, which consumed more than 25 years of CPU and GPU time.


Session 4: Evolution of the Virus

20 May 2020, 17.00-18.00 CEST (starts at 16.00 BST, 11.00 EDT, 8.00 PDT)


  • Sergei Pond, Professor of Biology, Institute for Genomics and Evolutionary Medicine, Temple University, US


The analyses have been performed using the Galaxy platform and open source tools from BioConda. Tools were run using XSEDE resources maintained by the Texas Advanced Computing Center (TACC), Pittsburgh Supercomputing Center (PSC), and Indiana University in the U.S.; de.NBI and VSC cloud resources and IFB cluster resources on the European side; STFC-IRIS at the Diamond Light Source; and ARDC cloud resources in Australia.

Galaxy Project, European Galaxy Project, Australian Galaxy Project, bioconda, XSEDE, TACC, de.NBI, ELIXIR, PSC, Indiana University, Galaxy Training Network, Bio Platforms Australia, Australian Research Data Commons, VIB, ELIXIR Belgium, Vlaams Supercomputer Center, EOSC-Life, Datamonkey, IFB

See also: Galaxy Community