ELIXIR Deposition Databases for Biomolecular Data

ELIXIR has compiled a list of resources that it recommends for the deposition of experimental data. The scientific community has a shared responsibility to ensure long-term data preservation and accessibility. The purpose of this Deposition Databases list is to provide guidance to those who formulate policy and working practices about the appropriate repositories for publishing open data in the life sciences.

An ELIXIR Deposition Database is defined as a resource that is part of the ELIXIR Node portfolio of services and that accepts deposition of experimental data from an international community of researchers, extending beyond the funding envelope of the database itself.

The selected ELIXIR Deposition Databases meet the technical quality and governance criteria expected of ELIXIR Core Data Resources (see the F1000R article “Identifying ELIXIR Core Data Resources”), which align with the FAIR principles, but may be at an earlier stage of development, meeting an emerging scientific requirement, or may be narrower in scope. Consequently some, but not all, of the ELIXIR Deposition Databases also appear in the ELIXIR Core Data Resources list.

ELIXIR Deposition Databases are defined during the Core Data Resource and Deposition Database selection process, which runs periodically (see the timeline here). Applications for inclusion in the Deposition Database list can also be made by direct suggestion to the Data Platform.

The initial ELIXIR Deposition Databases list was defined in July 2017.

ELIXIR Deposition Database list

Deposition Database	Data type	International collaboration framework ¹
ArrayExpress	Functional genomics data. Stores data from high-throughput functional genomics experiments.
BioModels	Computational models of biological processes.
BioSamples	BioSamples stores and supplies descriptions and metadata about biological samples used in research and development by academia and industry.	NCBI BioSample database
BioStudies	Descriptions of biological studies, links to data from these studies in other databases, as well as data that do not fit in the structured archives.
EGA	Personally identifiable genetic and phenotypic data resulting from biomedical research projects.	European Bioinformatics Institute and the Centre for Genomic Regulation
EMDB	The Electron Microscopy Data Bank is a public repository for electron microscopy density maps of macromolecular complexes and subcellular structures.
EMPIAR	EMPIAR, the Electron Microscopy Public Image Archive, is a public resource for raw images underpinning 3D cryo-EM maps and tomograms (themselves archived in EMDB). EMPIAR also accommodates 3D datasets obtained with volume EM techniques and soft and hard X-ray tomography.
ENA	The European Nucleotide Archive (ENA) provides a comprehensive record of the world’s nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation.
EVA	The European Variation Archive covers genetic variation data from all species.	dbSNP and dbVar
GWAS Catalog	The NHGRI-EBI GWAS Catalog is a curated collection of all human genome-wide association studies, produced by a collaboration between EMBL-EBI and NHGRI.
IntAct	IntAct provides a freely available, open source database system and analysis tools for molecular interaction data.	The International Molecular Exchange Consortium
LIPID MAPS®	LIPID MAPS is designed to be an open, systematic and standardised lipidomics resource, providing information on lipids and their structures, properties and functions in biological processes.
MetaboLights	Metabolite structures and their reference spectra as well as their biological roles, locations and concentrations, and experimental data from metabolic experiments.
ModelArchive	ModelArchive provides a unique stable accession code (DOI) for each deposited theoretical model of a macromolecular structure, which can be directly referenced in corresponding manuscripts. Besides actual model coordinates, archiving of models often includes details about assumptions, parameters and constraints applied in the simulation to allow the user of a model to assess and if necessary reproduce the simulation.
PDBe	Biological macromolecular structures.	wwPDB
PRIDE	Mass spectrometry-based proteomics data, including peptide and protein expression information (identifications and quantification values) and the supporting mass spectra evidence.	The ProteomeXchange Consortium

Further information: Get in touch with Fabio Liberante at core-resources@elixir-europe.org

¹ An international collaboration framework enables content sharing on a formal level. This is often signified by a shared accession number system, such that data deposited in one database become part of the shared data collection and are also available through other partner portals.