Exciting progress in the ELIXIR Galaxy community

Galaxy Europe (https://usegalaxy.eu) provides a free data analysis environment for the growing number of life science researchers. Less than two years after its launch, the platform boasts over 12,000 unique users, who have executed more than six million analysis jobs and uploaded nearly 13 million datasets.

Galaxy Europe is one of the three official Galaxy servers that implement a common core set of tools and reference genomes and are open to anyone. With more than 2,000 scientific tools available, the platform is the biggest Galaxy instance in Europe, covering many bioinformatics topics and communities.

It is hosted and managed by the University of Freiburg (part of de.NBI / ELIXIR Germany) as one of ELIXIR’s flagship activities. 

Galaxy-in-numbers: 12,000 users, 13 million datasets, 6 million jobs executed, 2,000 tools, 7 TB of reference data.

Why Galaxy?

As the analysis of diverse biomedical data becomes an increasingly complex task, researchers often face a difficult choice: either rely on proprietary software, if available, which will negatively affect the reproducibility of their results, or use open-source software, which may require more specialised programming skills and the right computing environment.

Galaxy solves this issue by offering an intuitive graphical user interface for complex analytical tasks, backed by powerful data publishing tools to support reproducible and reusable research.

As working with data becomes an integral part of life science research, Galaxy is an important part of a bioinformatics infrastructure that lowers the barriers to advanced and computationally intensive data analyses.

Towards a European network of Galaxy resources

To keep up with the growing demand, the ELIXIR Galaxy Community is building a network of data centres and High Performance Computing clusters that share their computing power in support of Galaxy Europe users.

“To make Galaxy Europe sustainable in the long term, it is essential that it doesn’t depend on just one institution or one ELIXIR Node,” says Bjoern Gruening, Head of the Galaxy Europe team and co-Lead of the ELIXIR Galaxy Community, “that’s why we are teaming up with our colleagues from other ELIXIR Nodes to build a truly pan-European platform open to anyone.”

The proposed Pulsar Network will connect data centres in Belgium, Czechia, Germany, Italy, Portugal, Norway, Spain, and the UK to share and distribute the computational tasks requested by Galaxy Europe users. The network is currently in development, and its launch is planned for July 2020.

ELIXIR AAI, data, tools and training integration

Galaxy supports data integration, allowing direct uploads from many bioinformatics resources, such as InterMine or the European Nucleotide Archive. Users can also directly access seven terabytes of reference data containing hundreds of reference genomes, covering biomedical as well as plant sciences.

Most recently, Galaxy Europe has made available the multi-omics human panel reference data from the Personal Genome Project, which provides free and unrestricted access to genomics data. Galaxy users can thus directly access the complete unprocessed data, as well as pregenerated results, through a public data library, without first having to download and upload the data.

Galaxy is also fully integrated with the ELIXIR Authentication and Authorisation Infrastructure (AAI), which enables single sign-on for users and provides the tools for secure access to sensitive data. This integration will be compatible with plans around access and authentication in the European Open Science Cloud as well as with national life science cloud facilities.

Alongside the typical web user interface, Galaxy Europe also offers a live instance which supports working with interactive tools, such as Jupyter Notebooks or RStudio. Registered users can start up to 10 different interactive tools and easily transfer data back and forth between Galaxy and Jupyter or RStudio. 

Training provision is also an integral component of these new developments. The ELIXIR Galaxy Community provides Training Infrastructure as a Service for the Galaxy training community, which any trainer can use free of charge. These ready-to-run virtual environments provided by Galaxy Europe are guaranteed to work with the official training materials, so trainers don’t have to waste time helping their students to install all the required software and tools.

Meeting the needs of ELIXIR Communities

The Galaxy Community in ELIXIR and the Galaxy Europe team are continuously developing new tools to cater to the needs of different life science fields.

Since June 2019 the ELIXIR Galaxy Community has been working on a large ELIXIR Implementation Study involving 11 ELIXIR Nodes. The goal is to expand the portfolio of analysis workflows for five existing ELIXIR Communities: Plant sciences, Marine metagenomics, Metabolomics, Proteomics and Structural bioinformatics.

The second goal is to facilitate access to data from ELIXIR Core Data Resources and ELIXIR Deposition Databases, which will allow automatic download of any given dataset, based on standardised metadata criteria.

Expanding Galaxy Europe to new research communities and new data resources will further strengthen the position of Galaxy as an integral part of life science data infrastructure, both within and outside ELIXIR.

Galaxy Europe resources

Mon 10 February 2020