Software containers are a key element of Open Science, Open Data and Open Source, which are strongly supported and advocated by ELIXIR. When described as part of scientific workflows, software containers are key to guaranteeing data provenance and are an important step towards reproducible results.
They also ease software installation on local computers and cluster facilities. Thus, software containers cut across most of the strategic lines of the ELIXIR Tools Platform for the 2022-2023 Scientific Programme.
We have divided this task into two work packages around software containers that complement each other.
The first work package aims to maintain and extend the work initiated in the previous ELIXIR implementation study on BioContainers. That study helped unify various software container initiatives across ELIXIR Nodes and bring them under a common infrastructure and metadata federation. This first work package will focus on the operational maintenance of the infrastructure. Additional registry mirrors will be investigated to improve cloud and local registry availability. To this end, a partnership with Amazon (and potentially other commercial cloud providers) will be established.
x86, currently the only container architecture supported by BioContainers, dominates scientific compute clusters and clouds at this time. However, ARM is a mature technology gaining traction in both server and consumer markets. With the first supercomputers starting to build on ARM architectures, the BioContainers project needs to be prepared to offer ARM-based container solutions to our users. Therefore, one task of this work package will evaluate multi-architecture container solutions and add support for ARM in addition to x86. As this task requires extra physical resources, we will seek support from Arm (the company) and/or from ELIXIR members to build ARM-based BioContainers.
To compile software for different architectures, upstream support from tool developers will most likely be needed. Here, we will work together with the “development best practices” (Task 4) team to encourage developers to provide multi-arch support for their software and to help them along the way.
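To make the notion of a multi-architecture container concrete, the following minimal Python sketch shows how a client could pick the right image from an OCI image index (also called a manifest list), which is the standard structure a registry uses to publish one image name for several architectures. The index content, the placeholder digests and the helper names `list_platforms` and `select_manifest` are assumptions made for illustration only; they do not describe the current BioContainers infrastructure.

```python
# Hypothetical OCI image index for one tool published for two architectures.
# Field names follow the OCI image index layout; the digests are placeholders.
EXAMPLE_INDEX = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:" + "a" * 64,  # placeholder digest
            "size": 1234,
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:" + "b" * 64,  # placeholder digest
            "size": 1234,
            "platform": {"architecture": "arm64", "os": "linux"},
        },
    ],
}


def list_platforms(index):
    """Return the sorted 'os/architecture' strings advertised by an image index."""
    return sorted(
        f"{m['platform']['os']}/{m['platform']['architecture']}"
        for m in index.get("manifests", [])
        if "platform" in m
    )


def select_manifest(index, architecture, os_name="linux"):
    """Return the digest of the manifest matching the platform, or None if absent."""
    for manifest in index.get("manifests", []):
        platform = manifest.get("platform", {})
        if (
            platform.get("architecture") == architecture
            and platform.get("os") == os_name
        ):
            return manifest["digest"]
    return None


if __name__ == "__main__":
    print(list_platforms(EXAMPLE_INDEX))
    print(select_manifest(EXAMPLE_INDEX, "arm64"))
```

In this sketch, a tool that has not yet been built for ARM would simply lack an `arm64` entry, and `select_manifest` would return `None`. A check of this kind could be used to track which containers still need upstream multi-arch work.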
BioContainers already integrates its metadata with the central repository of the Tools Platform. Work will continue to align and homogenise this metadata as the repository evolves; in its first step, the repository was a raw aggregation of metadata from all Tools Platform services (BioContainers, OpenEBench, bio.tools, ...) and the Tools Ecosystem.
The second work package will focus on user communities. While BioContainers is widely used by the ELIXIR community, it could be extended to other life science communities. Discussions with EOSC and other communities will evaluate how BioContainers could fit their needs, and how they could contribute to the BioContainers project and the registry. This would give end users a single entry point and solution for container management, and would optimise the use of human and compute resources.
In summary, BioContainers is now well established in our community and provides a stable infrastructure for container availability and findability. Additional life science communities could benefit from the existing solutions. The energy-efficiency trend towards ARM architecture (and possibly others in the future) should not be ignored, and is an opportunity for BioContainers to support an emerging and growing community.