Most of what we know about human diseases and genes/proteins involved in them is buried in the biomedical literature. Text mining is thus a key tool for creating databases with structured data on disease-gene associations and the underlying gene networks.
In this webinar, Lars Juhl Jensen from the Novo Nordisk Foundation Center for Protein Research at the University of Copenhagen, Denmark, described a highly efficient text-mining pipeline that allows his team to systematically mine such information from very large corpora of full-text articles as well as abstracts. The presentation also covered how this information is made available through interoperable, web-based databases, including the DISEASES database and the STRING database (an ELIXIR Core Data Resource).
About Lars Juhl Jensen
Lars Juhl Jensen is a professor at the Novo Nordisk Foundation Center for Protein Research at the University of Copenhagen in Denmark. His research group develops state-of-the-art tools for generation and analysis of molecular interaction networks from proteomics data and text mining. The tools are freely available to the scientific community.
Lars Juhl Jensen started his research career in Søren Brunak’s group at the Technical University of Denmark, where he received his Ph.D. in bioinformatics in 2002 for his work on non-homology-based protein function prediction. During this time, he also developed methods for visualization of microbial genomes, pattern recognition in promoter regions, and microarray analysis. From 2003 to 2008, he was at the European Molecular Biology Laboratory (EMBL), where he worked on literature mining, integration of large-scale experimental datasets, and analysis of biological interaction networks. Since 2009, he has continued this line of research as a professor at the Novo Nordisk Foundation Center for Protein Research and as a founder, owner and scientific advisor of Intomics A/S.
He is a co-author of more than 180 scientific publications, which have received more than 25,000 citations in total. He was awarded the Lundbeck Foundation Talent Prize in 2003, and his work on cell-cycle research was named “Break-through of the Year” in 2006 by the magazine Ingeniøren. His work on text mining won first prize in the “Elsevier Grand Challenge: Knowledge Enhancement in the Life Sciences” in 2009, and he was awarded the Lundbeck Foundation Prize for Young Scientists in 2010.