Lit-OTAR framework for extracting biological evidences from literature.

Journal: Bioinformatics (Oxford, England)
PMID:

Abstract

SUMMARY: The lit-OTAR framework, developed through a collaboration between Europe PMC and Open Targets, leverages deep learning to revolutionize drug discovery by extracting evidence from scientific literature for drug target identification and validation. This novel framework combines named entity recognition, which identifies gene/protein (target), disease, organism, and chemical/drug entities within scientific texts, with entity normalization, which maps these entities to databases such as Ensembl, Experimental Factor Ontology, and ChEMBL. Continuously operational, it has processed over 39 million abstracts and 4.5 million full-text articles and preprints to date, identifying more than 48.5 million unique associations that significantly help accelerate the drug discovery process and scientific research (>29.9 million distinct target-disease, 11.8 million distinct target-drug, and 8.3 million distinct disease-drug relationships).
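
The two stages named in the summary, entity recognition followed by normalization, can be illustrated with a minimal Python sketch. This is not the lit-OTAR implementation: the framework uses deep learning models, whereas this toy version substitutes a plain dictionary lookup so that it stays self-contained. The lexicon, the placeholder identifiers, the function names `tag_entities` and `cooccurrences`, and the choice to treat same-sentence co-occurrence as an association are all assumptions made for illustration.

```python
import re
from itertools import combinations

# Toy lexicon: surface form -> (entity type, normalized identifier).
# The identifiers are made-up placeholders, NOT real Ensembl/EFO/ChEMBL accessions.
LEXICON = {
    "brca1": ("target", "ENSEMBL:PLACEHOLDER_1"),
    "breast cancer": ("disease", "EFO:PLACEHOLDER_2"),
    "olaparib": ("drug", "CHEMBL:PLACEHOLDER_3"),
}


def tag_entities(sentence):
    """Return (start, end, type, identifier) tuples for lexicon hits, sorted by position."""
    hits = []
    lowered = sentence.lower()
    for term, (etype, ident) in LEXICON.items():
        # Word boundaries keep "brca1" from matching inside longer tokens.
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((match.start(), match.end(), etype, ident))
    return sorted(hits)


def cooccurrences(sentence):
    """Pair normalized entities of different types found in the same sentence."""
    entities = {(etype, ident) for _, _, etype, ident in tag_entities(sentence)}
    pairs = set()
    for first, second in combinations(sorted(entities), 2):
        if first[0] != second[0]:  # skip pairs of the same entity type
            pairs.add((first, second))
    return pairs


if __name__ == "__main__":
    text = "Olaparib is used in BRCA1-mutated breast cancer."
    for hit in tag_entities(text):
        print(hit)
    for pair in sorted(cooccurrences(text)):
        print(pair)
```

Run on the example sentence, the sketch tags one drug, one target, and one disease mention, then emits the three cross-type pairs that correspond to the relationship categories counted in the summary (target-disease, target-drug, disease-drug). A production system would replace the lexicon with trained recognition models and resolve mentions against the actual databases.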

Authors

  • Santosh Tirunagari
    Department of Psychology, Middlesex University, London, United Kingdom. Correspondence to: Dr Santosh Tirunagari, Department of Psychology, Middlesex University, London, United Kingdom. s.tirunagari@mdx.ac.uk.
  • Shyamasree Saha
    Literature Services, EMBL-EBI, Wellcome Trust Genome Campus, Cambridge, UK.
  • Aravind Venkatesan
    Institut de Biologie Computationnelle (IBC), Univ. of Montpellier, Montpellier, France.
  • Daniel Suveges
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
  • Miguel Carmona
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
  • Annalisa Buniello
    Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
  • David Ochoa
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
  • Johanna McEntyre
    European Molecular Biology Laboratory (EMBL-EBI), European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, UK.
  • Ellen McDonagh
    Open Targets, European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom.
  • Melissa Harrison
    Data Services Teams, EMBL's European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK.