The ELIXIR Core Data Resources (CDRs) are collectively accessed millions of times per month by hundreds of thousands of users across the world. They are explicitly mentioned (by name or accession number) in 17 percent of open access publications in EuropePMC and are used extensively across all fields of life science research, in both industry and academia.
These findings are reported in a recently published preprint exploring the usage, impact and sustainability of the ELIXIR Core Data Resources. Authored by members of the ELIXIR Data Platform together with the managers of the Core Data Resources, the paper shows the critical value of ELIXIR Core Data Resources for life science research.
The authors draw on the data collected as part of the ELIXIR CDR selection process and its subsequent updates, which cover the years 2013-2017. For the first time, the set of ELIXIR CDRs is presented as a collective entity within the global open data infrastructure.
Infrastructure for the management of life science data that scales with the challenge
The quantity of data stored in the Core Data Resources has reached over 2.72 billion entries. While the volume of data managed by CDRs tripled and the number of users accessing the CDRs doubled between 2013 and 2017, the number of staff working in these resources increased by only one sixth. This illustrates the value for money CDRs offer and the scalability of their technical solutions, though it raises questions about sustainability in the longer term.
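To make the scaling argument concrete, the growth figures above can be turned into a rough per-staff comparison. The following is a minimal sketch that treats "tripled", "doubled" and "one sixth" as exact factors, which they are not (the text gives rounded figures); the function and variable names are illustrative only and do not come from the preprint.

```python
from fractions import Fraction

# Growth factors for 2013-2017, taken from the rounded figures in the text.
# Assumption: each is treated as an exact multiplier.
data_growth = Fraction(3)            # data volume tripled
user_growth = Fraction(2)            # number of users doubled
staff_growth = 1 + Fraction(1, 6)    # staff increased by one sixth

def per_staff_growth(growth, staff):
    """Return how much a quantity grew relative to the growth in staff."""
    return growth / staff

data_per_staff = per_staff_growth(data_growth, staff_growth)
users_per_staff = per_staff_growth(user_growth, staff_growth)

print(f"Data per staff member grew by a factor of about {float(data_per_staff):.2f}")
print(f"Users per staff member grew by a factor of about {float(users_per_staff):.2f}")
```

Under these assumptions, each staff member supports roughly two and a half times as much data and a little over one and a half times as many users as in 2013.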
By mining the full-text open access scientific papers in EuropePMC, the authors found that over 57,000 papers cited ELIXIR Core Data Resources in 2017. The figures also show that the citations go far beyond bioinformatics and molecular biology, reaching virtually every domain of life science.
The challenge of sustainable funding
As many research funders and scientific publishers now mandate deposition of research data into open access data resources, ELIXIR Core Data Resources provide a stable infrastructure that facilitates reproducibility and ensures long-term preservation of scientific data.
However, the funding for many Core Data Resources is not secured beyond a very short horizon. Only four of the 19 ELIXIR Core Data Resources have enough assured funding to keep the same level of staffing one year from January 2019. This figure illustrates that much of the infrastructure is funded by short-term grants, which do not allow for long-term development and planning.
The European resources from which ELIXIR Core Data Resources are selected represent only a fraction of life sciences data resources worldwide. The rest of the world also develops and hosts data resources, and many of these are as important to the global life sciences data ecosystem as ELIXIR Core Data Resources. Many of the global resources are also at risk from short-term and unstable funding cycles.
ELIXIR contributes to a Global Biodata Coalition that will explore how to ensure the commitment necessary to support this infrastructure so fundamental to life science research.
About ELIXIR Core Data Resources
The ELIXIR Core Data Resources define a cohort within the global life sciences infrastructure that funders and other stakeholders may use as a basis for structuring policies that support long-term sustainability, for both the Core Data Resources and the greater worldwide life sciences data infrastructure. At the moment, there are 19 Core Data Resources from seven ELIXIR Nodes.
The selection process that ELIXIR developed for the Core Data Resources provides a model for identifying other crucial resources worldwide, which will allow funders to support the worldwide life sciences data resource ecosystem more efficiently. The emerging Global Biodata Coalition, supported by funders and heads of international research organisations, will use this process as a model for a worldwide effort to secure long-term funding for crucial data resources.
More about Core Data Resources: https://elixir-europe.org/core-data-resources
Read the full article:
Rachel Drysdale, Charles E. Cook, Robert Petryszak, Vivienne Baillie-Gerritsen, Mary Barlow, Elisabeth Gasteiger, Franziska Gruhl, Jürgen Haas, Jerry Lanfear, Rodrigo Lopez, Nicole Redaschi, Heinz Stockinger, Daniel Teixeira, Aravind Venkatesan, ELIXIR Core Data Resource Forum, Niklas Blomberg, Christine Durinx and Johanna McEntyre.
The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences, bioRxiv 598318; doi: https://doi.org/10.1101/598318