Scalable Curation (2022-23)

Most data and literature curation processes start from an entity-centric query (e.g. a gene or gene product, a disease, a chemical compound). However, most databases also want to access and curate content along other lines: biological phenomena (e.g. Intrinsically Disordered Proteins) or domain-specific areas of biology (e.g. lipidomics, glycomics, rare diseases). These are not easily expressed as a combination of keywords; therefore, non-entity-centric literature exploration tools are needed.

Further, literature curation will expand beyond abstracts to include full text, supplementary data and pre-prints (in several versions). This will be the focus of Task.3 in 2022-23.

  1. Literature screening and alerting services to speed up curation: unlike most curation-support tools, which are entity-centric (e.g. curation workflows start with a gene), this WP aims to trigger an alert as soon as a new article relevant to a given database is published.
  2. Triage-as-a-Service for any published data, including full text, pre-prints and supplementary data: expand triage services to full-text articles, pre-prints and supplementary data (in connection with T2.1).
  3. Community requests for additional connectivity between ELIXIR data resources and the literature are expected to continue in 2022-23, so this WP will be carried forward as is.
  4. Monitor and assess the ongoing Data Curation Implementation studies arising from the 2021 RFP process.