KYBELE: Knowledge Yield from BiodivErsity Literature through Large Language Model (LLM) Extraction

KYBELE develops an AI-powered, FAIR-compliant system to unlock biodiversity knowledge hidden in scientific literature. By harvesting species data, ecological traits and habitat information, and annotating BiodiversityPMC with domain-specific vocabularies, the project creates structured datasets for reuse. A fine-tuned LLM will power an interactive literature-exploration chatbot deployed via ELIXIR and LifeWatch ERIC. KYBELE directly advances BFSP priorities by enabling scalable, automated extraction of biodiversity knowledge.

Predicted outcomes:

  • FAIR-aligned catalogues of species, traits and habitats
  • Annotated BiodiversityPMC enriched with domain vocabularies
  • Public LLM-powered biodiversity literature exploration service
  • Containerised extraction workflows integrated with ELIXIR/LifeWatch ERIC
  • FAIR training materials and documentation for adoption

Duration: 2026 to 2027