Automated Mapping of Real-world Oncology Laboratory Data to LOINC.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
PMID:

Abstract

In this study we evaluate whether automated mapping methods can reduce the manual burden of mapping laboratory data to LOINC® in a nationwide, oncology-specific dataset derived from electronic health records. We developed novel encoding methodologies to vectorize free-text lab data, and evaluated logistic regression, random forest, and k-nearest neighbors (KNN) machine learning classifiers. All machine learning models performed significantly better than deterministic baseline algorithms. The best classifiers were random forests, which predicted the correct LOINC code 94.5% of the time. Ensemble classifiers further increased accuracy: in the best ensemble, the constituent classifiers predicted the same code 80.5% of the time, with an accuracy of 99% on those predictions. We conclude that an automated laboratory mapping model can both reduce manual mapping time and increase the quality of mappings, suggesting that automated mapping is a viable tool in a real-world oncology dataset.
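The ensemble figures reported above pair two numbers: how often the classifiers predict the same code, and how accurate those shared predictions are. The sketch below shows one way such a pair could be computed, assuming a unanimous-agreement rule; the abstract does not state the actual ensembling rule, so `unanimous_vote`, `agreement_report`, and the toy predictions are hypothetical and for illustration only.

```python
def unanimous_vote(predictions):
    """Return the shared code if every model predicts it, else None."""
    first = predictions[0]
    return first if all(p == first for p in predictions) else None


def agreement_report(per_model_predictions, true_codes):
    """Compute (coverage, accuracy) for a unanimous-agreement ensemble.

    per_model_predictions: one list of predicted codes per model.
    coverage: fraction of rows where all models predict the same code.
    accuracy: fraction of those agreed rows where the code is correct.
    """
    agreed = 0
    correct = 0
    for i, truth in enumerate(true_codes):
        vote = unanimous_vote([preds[i] for preds in per_model_predictions])
        if vote is None:
            continue  # models disagree: leave the row for manual mapping
        agreed += 1
        if vote == truth:
            correct += 1
    coverage = agreed / len(true_codes)
    accuracy = correct / agreed if agreed else 0.0
    return coverage, accuracy


# Toy data (assumed, not from the study): codes serve only as class labels.
true_codes = ["718-7", "2345-7", "2160-0", "718-7", "2345-7"]
random_forest = ["718-7", "2345-7", "2160-0", "718-7", "2160-0"]
logistic_reg = ["718-7", "2345-7", "2160-0", "2345-7", "2160-0"]
knn = ["718-7", "2345-7", "2160-0", "718-7", "2160-0"]

coverage, accuracy = agreement_report(
    [random_forest, logistic_reg, knn], true_codes
)
print(f"coverage={coverage:.2f} accuracy={accuracy:.2f}")
```

Under this rule, rows where the models disagree fall back to manual review, which is one way an ensemble could trade coverage for higher accuracy on the rows it does map.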

Authors

  • Jonathan Kelly
    Flatiron Health Inc, New York, New York.
  • Chen Wang
    Department of Cardiovascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
  • Jianyi Zhang
    Georgetown University, Washington D.C.
  • Spandan Das
    Flatiron Health Inc, New York, New York.
  • Anna Ren
    Flatiron Health Inc, New York, New York.
  • Pradnya Warnekar
    Flatiron Health Inc, New York, New York.