Eco-Flow – A bioinformatics ecosystem for agri-ecology

We are excited to announce the start of our BBSRC- “Bioinformatics and Biological Resources Fund” grant to help build next generation agri-ecology workflows and their respective communities.

We are a team of ecologists (Seirian Sumner), bioinformaticians (Yannick Wurm), Chris Wyatt and pipeline developers (Seqera). Our aim is to develop bioinformatics resources to ease the bottleneck of bioinformaticians in the agri-ecological research community.

WHY:

Agri-ecology research is critical for our understanding of the natural environment and the security of our food supply. In the UK, we have some of the world leading research labs trying to tackle these key issues. However, bioinformatics skills in these areas lag far behind other research fields such as medicine; this bottleneck is holding back research and the speed at which this critical research gets published, with a knock-on effect to hampering recommendations for urgently-needed policy change.

SO, WHAT IS LACKING?

One of the main problems which current bioinformatic approaches is reproducibility. Code is often published with a list of bash scripts that may/may not run on your operating system, could have dependency issues, are not well documented or lack critical test data. This makes it near impossible to reproduce and slows down our ability to truly review the current literature. It also wastes a huge number of human hours.

HOW WILL WE IMPROVE THE STATUS QUO?

1. TRAINING: We address this mismatch by building resource bridges between ecologists – especially those working at the interface of insect ecology and agriculture – and bioinformatics; we will empower them with the autonomy to access and use the most appropriate cutting-edge analyses. Let’s remove the black box of bioinformatics!

2. BUILDING: We will develop (within end-user communities) reproducible, easy-to-use, modular bioinformatics pipelines suitable for the most useful analyses required by ecologists in the application of omics data to address critical ecological questions. Specifically, we will use Nextflow, a world leading pipeline development structure to create these pipelines.

3. TESTING: Once we have developed our pipelines, we will ensure each module has unit testing (using nf-test), as well as end to end pipeline tests with example data. Alongside our ambassadors and partner labs we can ensure pipelines are continuously tested and improved, so they remain active and relevant.

WHAT KINDS OF PIPELINES WILL WE CREATE?

In short: whatever the community most need! Here are some examples of the kinds of pipelines we may work on. But ultimately, the choices will be decided in consultation with the target end-users (currently the UK agri-ecology community). It could be related to genomics, transcriptomics, image processing, eDNA, AI or any other relevant technique that could be improved through reproducible pipelines.

WHAT’S HAPPENING NOW?

In early 2024, we will be canvassing the UK ecology community for their needs – the collaborative pipelines they need help with – in order to deliver the pipelines with the most impact. We will also be providing training opportunities for end-users, and looking for datasets to test the pipelines on. Please get in contact with us if you have an agri-ecology project with a bioinformatics bottle-neck, have a dataset sitting on a server that you need help moving to analysis, or would like to learn more or get involved. We are happy to talk!

Contact: c.wyatt@ucl.ac.uk