Linking biological data using data science and cross-disciplinary software development

Florian Huber

At the Netherlands eScience Center, more than 50 domain and computer scientists contribute to an unusually broad range of projects across all quantitative scientific domains. This breadth triggers cross-disciplinary exchange, the strength of which will be illustrated by presenting an ongoing project on linking metabolomic and genetic data. Computational analysis of genomes can predict biosynthetic gene clusters (BGCs) responsible for producing complex biochemical compounds, many of which are still unknown. By adapting machine-learning tools from different domains, we are developing a novel method that aims to pinpoint causal links between predicted BGCs and as-yet-unidentified compound signals detected in rich metabolite mixtures.

🎥 This talk was recorded on video and is available at

🖥 The slides for this talk are available as a PDF file. Unless otherwise noted in the slides themselves, they are published under a CC BY 3.0 license.