OPTIC: Optimizing Patient-Provider Triaging & Improving Communications in Clinical Operations using GPT-4 Data Labeling and Model Distillation
Journal: arXiv
Published Date: Feb 5, 2025
Abstract
The COVID-19 pandemic has accelerated the adoption of telemedicine and
patient messaging through electronic medical portals (patient medical advice
requests, or PMARs). While these platforms enhance patient access to
healthcare, they have also increased the burden on healthcare providers due to
the surge in PMARs. This study seeks to develop an efficient tool for message
triaging to reduce physician workload and improve patient-provider
communication. We developed OPTIC (Optimizing Patient-Provider Triaging &
Improving Communications in Clinical Operations), a message triaging tool that
uses GPT-4 for data labeling and distills those labels into a BERT model. The
study used a dataset of 405,487 patient messaging encounters from Johns Hopkins
Medicine between January and June 2020. GPT-4-based prompt engineering produced
high-quality labeled data, which was then used to train a BERT model to
classify messages as "Admin" or "Clinical." Measured against GPT-4 labels on
the test set, the BERT model achieved 88.85% accuracy, a sensitivity of 88.29%,
a specificity of 89.38%, and an F1 score of 0.8842. BERTopic analysis
identified 81 distinct topics within the test data; for 58 of them,
classification accuracy exceeded 80%. The system was successfully deployed through Epic's
Nebula Cloud Platform, demonstrating its practical effectiveness in healthcare
settings.
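
The reported metrics (accuracy, sensitivity, specificity, F1) and the per-topic accuracy breakdown can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `binary_metrics` and `per_topic_accuracy` are hypothetical, the toy labels are invented for illustration, and treating "Clinical" as the positive class is an assumption, since the abstract does not say which class was positive.

```python
from collections import defaultdict


def binary_metrics(y_true, y_pred, positive="Clinical"):
    """Compute accuracy, sensitivity, specificity, and F1 for two classes.

    Assumption: "Clinical" is the positive class; the abstract does not
    state which class the authors treated as positive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def per_topic_accuracy(topics, y_true, y_pred):
    """Group messages by topic id and compute accuracy within each topic."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for topic, t, p in zip(topics, y_true, y_pred):
        total[topic] += 1
        correct[topic] += int(t == p)
    return {topic: correct[topic] / total[topic] for topic in total}


# Toy data, invented for illustration only.
reference = ["Clinical", "Clinical", "Admin", "Admin"]   # e.g. GPT-4 labels
predicted = ["Clinical", "Admin", "Admin", "Admin"]      # e.g. BERT output
topic_ids = [0, 0, 1, 1]                                  # e.g. topic assignments

metrics = binary_metrics(reference, predicted)
topic_acc = per_topic_accuracy(topic_ids, reference, predicted)
well_classified = [t for t, acc in topic_acc.items() if acc > 0.80]
```

In this sketch, `well_classified` collects the topics whose accuracy exceeds 80%, mirroring how a count such as "58 of 81 topics" could be derived from per-topic results.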