Ensembles of natural language processing systems for portable phenotyping solutions.

Journal: Journal of Biomedical Informatics

Abstract

BACKGROUND: Manually curating standardized phenotypic concepts, such as Human Phenotype Ontology (HPO) terms, from narrative text in electronic health records (EHRs) is time-consuming and error-prone. Natural language processing (NLP) techniques can automate phenotype extraction and thus make curating phenotypes from clinical texts more efficient. While individual NLP systems can perform well for a single cohort, an ensemble-based method might improve the portability of NLP pipelines across different cohorts.

Authors

  • Cong Liu
    Department of Bioengineering, University of Illinois at Chicago, 851 S Morgan St, Chicago, IL, 60607, USA.
  • Casey N Ta
    Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.
  • James R Rogers
    Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.
  • Ziran Li
    Department of Biomedical Informatics, Columbia University, New York, New York, USA.
  • Junghwan Lee
    Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.
  • Alex M Butler
    Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.
  • Ning Shang
    Columbia University, New York, New York.
  • Fabricio Sampaio Peres Kury
    Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.
  • Liwei Wang
    Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA.
  • Feichen Shen
    Department of Health Sciences Research, Rochester MN.
  • Hongfang Liu
    Department of Artificial Intelligence & Informatics, Mayo Clinic, Rochester, MN, United States.
  • Lyudmila Ena
    Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.
  • Carol Friedman
    Department of Biomedical Informatics, Columbia University, New York, New York, USA.
  • Chunhua Weng
    Department of Biomedical Informatics, Columbia University.