DNA and RNA sequencing have become increasingly important in medical and translational research. The data these techniques generate has created a large demand for secure ways to store, transfer and analyse human biomedical data that has been consented for research.
The Community's primary data source is the European Genome-phenome Archive (EGA), access to which is controlled. The EGA allows an authorised user to search sequenced material, patient samples stored in biobanks, and the metadata around patients (their illnesses, treatments, outcomes). It also queries national search engines on behalf of users. Datasets can then be downloaded into an EGA-compatible cloud or cluster local to the researcher.
The Federated Human Data Community extends and generalises the system of access authorisation and secure data transfer developed in the EGA. It aims to provide a framework for the secure submission, archiving, dissemination and analysis of human biomedical data across Europe.
Goals of the Community
To provide a sustainable infrastructure for storing, coordinating and distributing human data
- The infrastructure is based on the European Genome-phenome Archive (EGA), tranSMART and Galaxy. Researchers will use the EGA to store their raw data, tranSMART to collate different data sets for preliminary analysis, and a Galaxy cloud service for further analysis.
- Local-EGA: The Community is also developing a portable submission toolkit (Local-EGA). This will allow you to deposit sensitive human data locally (and comply with national guidelines for storing that data) but enable data reuse across national boundaries. If you are part of an ELIXIR Node, you can set up a local instance of the EGA with metadata from the main EGA. This will allow people to search both your local and the main EGA at once.
- Submission REST API: the Community is developing an API that you can use to submit data to a Local-EGA programmatically.
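The combined search described for Local-EGA above can be pictured with a small sketch. It is an illustration only, under stated assumptions: the record fields (`id`, `title`), the example identifiers and both function names are hypothetical and are not part of any EGA or Local-EGA interface. It assumes the local instance holds its own metadata plus a mirrored copy of the main EGA metadata, and that one search runs over both and removes duplicates.

```python
from typing import Dict, List

# Hypothetical metadata record: {"id": ..., "title": ...}
Record = Dict[str, str]


def search_catalogue(records: List[Record], term: str) -> List[Record]:
    """Return records whose title contains the term (case-insensitive)."""
    needle = term.lower()
    return [r for r in records if needle in r["title"].lower()]


def search_local_and_main(
    local: List[Record], mirrored_main: List[Record], term: str
) -> List[Record]:
    """Search local metadata and the mirrored main-EGA metadata at once.

    Local hits come first; a record present in both catalogues is
    reported only once, keyed on its (assumed unique) id.
    """
    seen = set()
    hits = []
    for record in search_catalogue(local, term) + search_catalogue(mirrored_main, term):
        if record["id"] not in seen:
            seen.add(record["id"])
            hits.append(record)
    return hits


if __name__ == "__main__":
    # Made-up records for illustration only.
    local = [{"id": "LOCAL-1", "title": "Breast cancer cohort"}]
    main = [{"id": "MAIN-7", "title": "Cancer panel"}]
    for hit in search_local_and_main(local, main, "cancer"):
        print(hit["id"], hit["title"])
```

Keying on a shared identifier is one simple way to avoid listing a dataset twice when the local instance already mirrors it; a real deployment may handle this differently.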
To provide standardised tools to discover and access human data
- Local-EGAs for metadata sharing: By extending the use of Local-EGAs the Community is increasing the amount of human data that is discoverable. Local-EGAs store metadata from the main EGA, so you can use a Local-EGA to search both the main EGA and the Local-EGA. You can also search and retrieve information from the Local-EGA by using the Local-EGA API, so you can build your own services based on the data available. In addition, the main EGA will gather the metadata from all the data submitted to Local-EGAs, so a search at the main EGA will allow you to find data located across all Local-EGAs.
- Beacon project: The Human Data Community is working with the Global Alliance for Genomics and Health (GA4GH) to use the Beacon discovery service for resources across ELIXIR. The Beacon service provides a simple way to make data discoverable. You can query the lightweight metadata provided by a data resource (a 'beacon') to ask questions like 'Does this dataset have genomes with this allele at that position?' and get a 'Yes' or 'No' answer. Beacons represent an important step towards collaborative, responsible sharing of human genomic data. An important objective is to provide a reference implementation that will scale to support data discovery across all the ELIXIR Data Nodes.
- Regulating access to sensitive data: the Community is working with the Compute Platform to use the ELIXIR Authentication and Authorization Infrastructure (AAI) for ELIXIR resources. The AAI is a system that allows you to have a single identity across a range of different services, so you can use the same log-in for each service. The ELIXIR AAI also contains a Resource Entitlement Management System (REMS). This lets you request access online to a restricted data resource, so that the Data Access Committee (DAC) for that resource can review your application. If you are granted access, you can then log in to the resource using the AAI (which verifies your right to access the data).
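The Beacon question quoted above ('Does this dataset have genomes with this allele at that position?') can be illustrated with a toy, in-memory model. This is a minimal sketch under stated assumptions, not the GA4GH Beacon API or the ELIXIR reference implementation: the `ToyBeacon` class, the `query_beacons` helper, the node names and the variant data are all hypothetical. The point it shows is that a beacon reveals only a yes/no answer, never the underlying records.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical variant key: (chromosome, position, allele)
Variant = Tuple[str, int, str]


class ToyBeacon:
    """In-memory stand-in for a data resource that answers yes/no queries."""

    def __init__(self, name: str, variants: Iterable[Variant]) -> None:
        self.name = name
        # Normalise alleles to upper case so lookups are case-insensitive.
        self._variants = frozenset(
            (chrom, pos, allele.upper()) for chrom, pos, allele in variants
        )

    def has_allele(self, chromosome: str, position: int, allele: str) -> bool:
        """Answer only True or False; no record-level data is returned."""
        return (chromosome, position, allele.upper()) in self._variants


def query_beacons(
    beacons: List[ToyBeacon], chromosome: str, position: int, allele: str
) -> Dict[str, bool]:
    """Ask every beacon the same question and collect the yes/no answers."""
    return {b.name: b.has_allele(chromosome, position, allele) for b in beacons}


if __name__ == "__main__":
    # Made-up nodes and variants for illustration only.
    beacons = [
        ToyBeacon("node-a", [("1", 12345, "T")]),
        ToyBeacon("node-b", [("2", 500, "G")]),
    ]
    for name, answer in query_beacons(beacons, "1", 12345, "T").items():
        print(name, "Yes" if answer else "No")
```

Asking several beacons the same question, as `query_beacons` does, mirrors the stated aim of data discovery across many Nodes, although a real network would send the query to remote services rather than local objects.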
To develop long-term management policies for human data
- The Community is documenting long-term data storage requirements and metadata mappings needed for submitting complex heterogeneous data into the EGA.
To ensure that human data in ELIXIR services is handled in accordance with the appropriate legal framework
- The Community ensures that ELIXIR services handling human data comply with the General Data Protection Regulation (GDPR).
ELIXIR funds a number of short-term technical projects called Commissioned Services that inform future service development, drive standards adoption and engage the Nodes. The Human Data Community is involved in the following projects:
For completed projects see the Commissioned Services page.
Find out more
- Contact Gary Saunders (gary.saunders [at] elixir-europe.org) to learn how you can get involved with ELIXIR's work with human data, and how this work can help you.
- The work in the Community is based on earlier Commissioned Services:
- Data Resource Implementations for the GA4GH Data Schema (2016-17)
- ELIXIR – IMI OncoTrack scoping study on long-term data handling (2016)
- Genomic data management for TraIT using the EGA (2015-16)
- Beacon project (2015-16)
- Genomic data management for TraIT using the EGA: Case study in submission and access integration of controlled access data with tranSMART and Galaxy to serve large European cohort studies (2015)
- Interoperable controlled-access big data transfer for ELIXIR - expanding EGA collaboration (2014)
- ELIXIR and GA4GH Beacon Team Up to Advance Genomic Data Sharing (news story)
- Brochure on ELIXIR Human data activities