nala: text mining natural language mutation mentions.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: The extraction of sequence variants from the literature remains an important task. Existing methods primarily target standard (ST) mutation mentions (e.g. 'E6V'), leaving relevant natural language (NL) mentions largely untapped (e.g. 'glutamic acid was substituted by valine at residue 6').

Authors

  • Juan Miguel Cejuela
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Aleksandar Bojchevski
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Carsten Uhlig
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Rustem Bekmukhametov
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Sanjeev Kumar Karn
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Shpend Mahmuti
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Ashish Baghudana
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Ankit Dubey
    TUM, Department of Informatics, Bioinformatics & Computational Biology - i12, Garching, Munich, Germany.
  • Venkata P Satagopam
    Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Belvaux, Luxembourg.
  • Burkhard Rost