ELIXIR today publishes the initial list of ELIXIR Core Data Resources - data resources of fundamental importance to the life science community and the long-term preservation of biological data.
The ELIXIR Core Data Resources serve as a mark of the highest quality in infrastructure service provision and will drive ELIXIR’s discussions with funders and policy-makers on the sustainability of life science data resources.
The biomedical sciences have a long tradition of open sharing of data: genomic sequence and protein structure projects have, for example, provided datasets as soon as they are generated to open public archives. This tradition extends across the spectrum of life science research and is exemplified in the Nucleic Acids Research database catalogue, which now lists over 1500 databases. Many databases incorporate careful annotation – often by manual curation – whereby experimental information is extracted from the primary scientific literature to augment research data, and is housed in structured knowledgebases.
This open data infrastructure plays a critical role in the life sciences: reuse of the data extends beyond basic research into translational and biotechnological applications, forming a fundamental component of the digital knowledge economy. Managing these resources requires database managers and funders to understand their usage and role in scientific research, as well as their wider impact in society and industry.
“ELIXIR Core Data resources are absolutely critical for the integrity and advancement of life science research,” says Christine Durinx (Associate Director, SIB Swiss Institute of Bioinformatics), one of the co-Leaders of ELIXIR’s Data Platform. “Researchers around the world rely on the ability to freely deposit into and download data from these resources. They also provide a critical role for funders and scientific publishers. If for any reason we were to lose access to these Core Data Resources, it would have a devastating effect not only on science, but also on medicine, industry and innovation.”
The Core Data Resources form the backbone of ELIXIR’s sustainability strategy. Careful evaluation of the Core Data Resources’ usage can provide reliable measures of their scientific and economic value and highlight the benefits of sustainable infrastructure for open biological data. Over time, the process should pave the way towards a more global and long-term approach to the funding of core bioinformatics resources.
The selection process and indicators used in the evaluation of the candidate Core Data Resources were presented to the scientific community in a paper published last October. Grouped into five categories (scientific focus and quality of science; community served by the resource; quality of the service; legal and funding infrastructure, and governance; and impact and translational stories), the indicators framed the review process, which involved external reviewers as well as the ELIXIR Heads of Nodes (the Directors of national bioinformatics infrastructures) and the Scientific Advisory Board. The indicators serve as quality benchmarks and drive the development of high-performance, open-access biological data resources that serve the life science community.
Jo McEntyre (EMBL-EBI), co-Leader of the ELIXIR Data Platform, said: “We are already seeing the first benefits of the process to identify the Core Data Resources in terms of improving ELIXIR’s capacity to deliver data resources that meet the scientific need. For example, as a result of the evaluation process, we have seen data resources change their license to align with ELIXIR’s Open Access principles, allowing more extensive data reuse not only for basic research but for industry too.”
ELIXIR Deposition Databases
In addition to the list of Core Data Resources, ELIXIR has compiled a list of databases that it recommends for the deposition of experimental data. The purpose of this list is to provide guidance to journals and funders on the appropriate repositories in which to publish open data in the life sciences. ELIXIR Deposition Databases meet the technical quality and governance criteria expected of ELIXIR Core Data Resources, but may be at an earlier stage of development, meeting an emerging scientific requirement, or narrower in scope. Consequently, some, but not all, of the ELIXIR Deposition Databases also appear in the list of ELIXIR Core Data Resources.
The ELIXIR Core Data Resources and the ELIXIR Deposition Databases necessarily demonstrate long-term quality and consistency, but neither list is static. The selection and evaluation process will take place regularly, and further resources will be included as the ELIXIR data infrastructure evolves to accommodate emerging technologies and changing scientific needs.
ELIXIR will establish a discussion forum for the Core Data Resources to consider how best to communicate the details and outcomes of the selection process and how to present the Core Data Resources to different audiences (e.g. researchers, funders, publishers or industry). The results will also feed into the initiative, launched in March 2017 and led by the Human Frontier Science Program Organization, to establish a global coalition to sustain core data resources.
This announcement is the culmination of months of committed effort by many people: the ELIXIR Heads of Nodes and Scientific Advisory Board, the panel of external expert reviewers who evaluated the candidate resource applications, and - not least - the representatives of all the ELIXIR data resources who embraced the project and participated in the process. Additionally, this work is only possible because of the culture of open data whereby researchers throughout the life sciences actively share their research in publicly accessible data resources. ELIXIR is grateful to all those who joined with us, in whatever capacity, to embark on this challenging but important exercise.