Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network.

Journal: Scientific reports
Published Date:

Abstract

The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable, rule-based phenotype algorithms in which natural language processing (NLP) components were added to improve the performance of existing algorithms that use electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP improved or maintained precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can strengthen communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms are essential to support local customizations.

Authors

  • Jennifer A Pacheco
    Northwestern University, Feinberg School of Medicine, Chicago, IL, USA.
  • Luke V Rasmussen
    Northwestern University, Feinberg School of Medicine, Chicago, IL, USA.
  • Ken Wiley
    National Human Genome Research Institute, Bethesda, USA.
  • Thomas Nate Person
    Pennsylvania State University, Hershey, USA.
  • David J Cronkite
    Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA.
  • Sunghwan Sohn
    Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, USA.
  • Shawn Murphy
    Research Information Systems and Computing, Partners Healthcare; Harvard Medical School; Department of Neurology, Massachusetts General Hospital, Boston, MA.
  • Justin H Gundelach
    Mayo Clinic, Rochester, USA.
  • Vivian Gainer
  • Victor M Castro
  • Cong Liu
    Department of Bioengineering, University of Illinois at Chicago, 851 S Morgan St, Chicago, IL, 60607, USA.
  • Frank Mentch
    Children's Hospital of Philadelphia, Philadelphia, USA.
  • Todd Lingren
    Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA.
  • Agnes S Sundaresan
    Geisinger, Danville, USA.
  • Garrett Eickelberg
    Northwestern University, Evanston, USA.
  • Valerie Willis
    National Human Genome Research Institute, Bethesda, USA.
  • Al'ona Furmanchuk
    Northwestern University, Chicago, Illinois, USA.
  • Roshan Patel
    Geisinger, Danville, USA.
  • David S Carrell
    Group Health Research Institute, Seattle, WA, 98101, USA.
  • Yu Deng
    National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, People's Republic of China.
  • Nephi Walton
    Intermountain Healthcare, Salt Lake City, USA.
  • Benjamin A Satterfield
    Mayo Clinic, Rochester, USA.
  • Iftikhar J Kullo
    Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minn.
  • Ozan Dikilitas
    Mayo Clinic, Rochester, USA.
  • Joshua C Smith
    Vanderbilt University Medical Center, Nashville, TN.
  • Josh F Peterson
    Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
  • Ning Shang
    Columbia University, New York, New York.
  • Krzysztof Kiryluk
    Columbia University, New York, USA.
  • Yizhao Ni
    Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH.
  • Yikuan Li
    Department of EECS, Northwestern University, Chicago, IL, U.S.A.
  • Girish N Nadkarni
    Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Elisabeth A Rosenthal
    University of Washington, Seattle, USA.
  • Theresa L Walunas
    Northwestern University, Evanston, USA.
  • Marc S Williams
  • Elizabeth W Karlson
    Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston.
  • Jodell E Linder
    Vanderbilt University Medical Center, Nashville, USA.
  • Yuan Luo
    Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA.
  • Chunhua Weng
    Department of Biomedical Informatics, Columbia University.
  • WeiQi Wei
    Vanderbilt University Medical Center, Nashville, USA.