Bioschemas: Community Adoption and Training

Bioschemas (http://bioschemas.org) is a community initiative that aims to improve data discoverability in the life sciences and to give our data repositories, including the ELIXIR Core and Node Data Resources, better exposure to generic search engines, such as Google, and to domain-specific repositories such as Identifiers.org, FAIRsharing.org, and DataMed. It does this by encouraging content providers in the life sciences to use Schema.org markup to expose consistent structured data on their websites.
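As a rough illustration of what such markup can look like, the sketch below builds a Schema.org Dataset description as JSON-LD and wraps it in the script element that crawlers read from a web page. It is a minimal example under stated assumptions, not a Bioschemas-compliant profile: the helper names, the dataset name, and the URL are hypothetical placeholders, and a real profile would typically require additional properties.

```python
import json


def dataset_jsonld(name, description, url, keywords):
    """Build a Schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }


def as_script_tag(markup):
    """Wrap JSON-LD in the script element embedded in a page's HTML."""
    body = json.dumps(markup, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical record: the name, description, and URL are placeholders.
example = dataset_jsonld(
    name="Example protein sample collection",
    description="Illustrative dataset entry used to show the markup shape.",
    url="https://example.org/datasets/1",
    keywords=["proteins", "samples"],
)
tag = as_script_tag(example)
print(tag)
```

A content provider would place the printed element in the HTML of the page that describes the dataset, so that the same page serves both human readers and crawlers.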

In March 2017 we started a pilot programme to build a Bioschemas community, define a first set of profiles for data repositories (catalogs), datasets, and specific data types, pilot markups for datasets primarily held by the EBI, and build a relationship with Schema.org. After an initial Implementation Study, we have succeeded in all these goals. The community momentum continues with:

Topic-specific workshops focusing on datasets from a particular community or data type, e.g. the Bioschemas Samples Workshop, March 2018;
Node-specific workshops focusing on datasets from a particular node regardless of data type, e.g. the ELIXIR-IT Bioschemas Workshop, February 2018; and
Close links with the "Enabling the reuse, extension, scaling, and reproducibility of scientific workflows" implementation study.

This project aims to lift Bioschemas from “pilot” to “practice” and will:

Disseminate and Train: Develop training material; improve the Bioschemas website; make compliant datasets visible in ELIXIR registries.
Adopt: Topic-specific workshops; Node-based workshops and inter-node staff exchange; strategic engagement with data platform Core Data Resources and implementation studies.
Sustain: Seek incorporation into Schema.org; establish Bioschemas Community governance.
Impact: Highlight use cases, showcase impact stories, and identify how to mitigate attrition.