Structural bioinformatics provides methods and tools to analyse, predict, archive and validate the three-dimensional (3D) structure data of biomacromolecules such as proteins, RNA or DNA.
The specific 3D shapes of macromolecules allow them to perform many functions within cells. Understanding their structures is therefore crucial for understanding the interactions and functions of cells, which in turn opens up potential for innovations in biotechnology and drug development.
Recent developments in experimental methods for determining macromolecular structures have led to a rapid expansion of structural data, in both quantity and quality. European structural bioinformatics groups, members of the ELIXIR 3D-BioInfo Community, have played a crucial role in validating these data and have developed various tools for working with them. ELIXIR Nodes are responsible for the development and sustainable operation of a number of tools used by the structural biology community, including three ELIXIR Core Data Resources: PDBe, CATH and InterPro.
The Community's white paper, "A community proposal to integrate structural bioinformatics activities in ELIXIR (3D-Bioinfo Community)", was published in April 2020. Christine Orengo (Structural and Molecular Biology Department, UCL, UK) leads the Community together with five Task leads (see below).
There are five main areas of interest for the 3D-BioInfo Community:
Activity 1: To develop the infrastructure for FAIR structural and functional annotations
There are many small data resources that derive value-added annotations from the Protein Data Bank (PDB) and the Electron Microscopy Data Bank (EMDB). However, a lack of data standards and uniform data access mechanisms significantly reduces the impact of these specialist data resources. To address this, the Community will:
- Support the development of the PDBe Knowledge Base (PDBe-KB) – a community-driven data resource for structural and functional annotations that places structural data in its biological context. PDBe-KB will increase the visibility and interoperability of niche data resources and enable comparisons between specific types of annotations obtained from different software tools.
- Establish data standards for different types of annotations and integrate these annotations using a community-driven data exchange format and a uniform data access mechanism.
- Develop a network of 3D-Beacons by integrating annotations from PDBe-KB with structural models from existing European and other international archives. This will lead to increased coverage of structure data in the sequence space.
This activity is coordinated by Sameer Velankar, Protein Data Bank in Europe, EMBL, UK.
Activity 2: To create open resources for sharing, integrating and benchmarking software tools for modelling the proteome in 3D
The 3D-BioInfo Community will:
- Develop tools to allow the scientific community to extend the current information on protein 3D structures, interactions and assemblies and extract knowledge from it. The initial focus will be on software tools and community-wide benchmarking for modelling 3D structures and conformational flexibility of proteins and protein assemblies.
- Extend the content of the ELIXIR benchmarking platform, OpenEBench, by adding software tools for structural biology. It will also include workflows and guidelines for modelling 3D structures of proteins, protein complexes and assemblies, based on known structures available in the PDBe.
- Develop tools for evaluating the quality of 3D models of proteins and protein complexes. This will improve the evaluation of molecular modelling methods and help developers optimise their procedures.
- Develop standard quality measures and evaluation protocols for 3D structure modelling tools, building on the work by the CAPRI and CAMEO communities.
- Develop a one-stop shop of benchmark datasets for testing and evaluating methods for generating, scoring and ranking models of protein complexes. The wider research community will also be invited to contribute their own datasets, following well-defined community-approved standards.
- Develop the necessary infrastructure for managing the CAPRI challenge (automated registration and submission, tools for accessing and navigating target information, predicted models and results).
- Develop a knowledge portal providing access to workflows and guidelines for various tools for modelling conformational flexibility.
This activity is coordinated by Shoshana Wodak, VIB-VUB, CSB, BE.
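As a rough illustration of the kind of quality measure and model ranking that Activity 2 refers to, the sketch below computes a plain coordinate root-mean-square deviation (RMSD) between a model and a reference structure and ranks several models by it. It assumes the structures are already superposed and list the same atoms in the same order. The function names `rmsd` and `rank_models` and the toy coordinates are hypothetical, and this is not the evaluation protocol used by CAPRI, CAMEO or OpenEBench, which rely on more elaborate measures.

```python
import math


def rmsd(model, reference):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Assumes both structures are already superposed (no fitting is done here).
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for atom_m, atom_r in zip(model, reference)
        for a, b in zip(atom_m, atom_r)
    )
    return math.sqrt(squared / len(model))


def rank_models(models, reference):
    """Return (name, rmsd) pairs sorted from closest to furthest from the reference."""
    scored = [(name, rmsd(coords, reference)) for name, coords in models.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy data (hypothetical): two-atom "structures" given as (x, y, z) tuples.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
models = {
    "model_a": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "model_b": [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0)],
}
ranking = rank_models(models, reference)
```

Here `ranking` lists `model_a` first, because its coordinates match the reference exactly. A real benchmark would also need to handle superposition, missing atoms and interface-specific measures.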
Activity 3: To help develop models for protein-ligand interactions
In silico technologies for modelling the interactions of proteins with drug-like compounds (ligands) can speed up the discovery of new medications and reduce the cost of the drug discovery process. To support these approaches, the 3D-BioInfo Community will:
- Develop benchmark datasets for assessing Structure-Based Drug Design (SBDD) tools on a large scale and under well-defined FAIR conditions, complementing efforts such as the Drug Design Data Resource Grand Challenges.
- Quantify different properties (e.g. charge, polarity, size, and flexibility of the binding site and ligand) for each entry in the benchmark, to enable evaluation of SBDD tools as a function of these properties.
- Develop links to other databases and standardise the retrieved data to complement the information provided for each protein-ligand complex in the benchmark sets.
- Add information on the non-bioactive conformations of ligands to standardise the comparison of docking calculations starting from such geometries.
- Add information on experimentally determined non-active compounds to be used as negative examples for testing virtual screening procedures.
- Make all benchmark datasets, benchmark workflows, and benchmarking results publicly available via the OpenEBench platform.
- Develop tutorials and other training materials for structure-based modelling and the interpretation of its results.
This activity is coordinated by Vincent Zoete, Department of Oncology, SIB, CH.
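To show why experimentally confirmed non-active compounds matter for testing virtual screening, as Activity 3 proposes, here is a minimal sketch of an enrichment factor calculation: it compares the share of actives in the top-ranked fraction of a screened library with the share of actives in the whole library. The function name `enrichment_factor`, the input layout and the example scores are assumptions made for illustration and are not taken from any Community benchmark.

```python
import math


def enrichment_factor(scored_compounds, fraction):
    """Enrichment factor for the top `fraction` of a ranked compound list.

    `scored_compounds` is a list of (score, is_active) pairs, where a higher
    score means a better predicted binder (an assumption of this sketch).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total_actives = sum(1 for _, active in scored_compounds if active)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")

    ranked = sorted(scored_compounds, key=lambda pair: pair[0], reverse=True)
    n_top = max(1, math.ceil(len(ranked) * fraction))
    top_actives = sum(1 for _, active in ranked[:n_top] if active)

    hit_rate_top = top_actives / n_top
    hit_rate_all = total_actives / len(ranked)
    return hit_rate_top / hit_rate_all


# Hypothetical screen: 2 actives and 8 confirmed inactives.
screen = [(9.1, True), (8.7, True)] + [(score, False) for score in
                                       (7.9, 7.2, 6.5, 6.1, 5.4, 4.8, 3.3, 2.0)]
ef_20 = enrichment_factor(screen, 0.2)
```

In this toy screen both actives rank in the top 20%, so the value is five times what random selection would give. Without reliable negative examples the denominator cannot be estimated meaningfully.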
Activity 4: To develop tools to describe, analyse, annotate and predict nucleic acid structures
The ultimate goal in this area is to encourage the development and use of software tools to describe, analyse, annotate, and predict nucleic acid (NA) structures. In particular the Community will:
- Catalogue software tools for building nucleic acid models based on their sequences alone as well as for modelling their 3D structures using experimental data, and facilitate the integration of these tools.
- Coordinate the unification of the existing NA geometry standards and formulate specifications for missing standards.
- Develop benchmark datasets for evaluating the quality of predicted or experimentally determined NA structures.
- Continuously update the catalogue of software tools developed by the RNA tools and software consortium, and extend these tools to DNA structures.
- Collaborate closely with the experimental structural biology communities (Instruct-ERIC and Euro-BioImaging) to ensure consistency across the different research communities.
This activity is coordinated by Bohdan Schneider, Institute of Biotechnology, CAS, CZ.
Activity 5: To establish a BioStudies database of protein engineering results
Fully predictable engineering of proteins to adopt desired structures and to exhibit desired functions and/or physical properties remains a key challenge. This activity seeks to collect data on the results of different approaches to protein engineering and design, with the underlying philosophy that we can learn as much from designs that fail as we can from designs that succeed.
To achieve this goal, we will establish a BioStudies database as a focal point where researchers can deposit such data.
- We will establish a resource in which commonly used tools that facilitate protein engineering are collected, alongside user feedback and comments.
- We will lead activities to facilitate the cross-fertilisation of ideas in the area of protein engineering and design, with particular emphasis on the support of networking by junior scientists.
The protein engineering activity is unique in involving wet-lab scientists alongside computational biologists. Including wet-lab practitioners facilitates a deep and meaningful assessment of database content. The protein engineering activity thus interacts productively with the other 3D-BioInfo activities.
This activity is coordinated by Lynne Regan, Centre for Synthetic and Systems Biology, UoE, UK.
FEBS enzyme engineering course
25-29 September 2023, Zagreb, Croatia
Community members might be interested in this course, which is aimed at disseminating important _in silico_ tools so that experimental researchers, Ph.D. students and young postdocs can apply them in their research projects. There will be lectures by experts on the development of computational tools as well as by experimentalists who use these tools. See the event website.
ELIXIR funds a number of short-term technical projects called Commissioned Services that inform future service development, drive standards adoption and engage ELIXIR Nodes. The 3D-BioInfo Community completed the following project in November 2019:
The Community is led by a seven-person Executive Committee:
Find out more
- Contact katharina.heil [at] elixir-europe.org if you would like to know more about the Community's work.
- Documents from the 2020 3D-BioInfo Community AGM
- Slides about the 3D-BioInfo ELIXIR Implementation Study presented at the ELIXIR All Hands meeting 2019.