UMLS-based data augmentation for natural language processing of clinical research literature.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

OBJECTIVE: The study sought to develop and evaluate a knowledge-based data augmentation method to improve the performance of deep learning models for biomedical natural language processing by overcoming training data scarcity.

Authors

  • Tian Kang
    Department of Biomedical Informatics, Columbia University, New York, New York, USA.
  • Adler Perotte
    Department of Biomedical Informatics, Columbia University, New York, New York, USA.
  • Youlan Tang
    Institute of Human Nutrition, Columbia University, New York, New York, USA.
  • Casey Ta
    Department of Biomedical Informatics, Columbia University, New York, New York, USA.
  • Chunhua Weng
    Department of Biomedical Informatics, Columbia University, New York, New York, USA.