Triage of documents containing protein interactions affected by mutations using an NLP based machine learning approach.
Journal:
BMC Genomics
Published Date:
Nov 10, 2020
Abstract
BACKGROUND: Information on protein-protein interactions (PPIs) affected by mutations is useful for understanding the biological effects of mutations and for developing treatments that target those interactions. In this study, we developed a natural language processing (NLP)-based machine learning approach for extracting such information from the literature. Our aim is to identify journal abstracts, or paragraphs in full-text articles, that contain at least one occurrence of a PPI affected by a mutation.