FoMoH: A clinically meaningful foundation model evaluation for structured electronic health records
Journal:
arXiv
Published Date:
May 22, 2025
Abstract
Foundation models hold significant promise in healthcare, given their
capacity to extract meaningful representations independent of downstream tasks.
This property has enabled state-of-the-art performance across several clinical
applications trained on structured electronic health record (EHR) data, even in
settings with limited labeled data, a prevalent challenge in healthcare.
However, there is little consensus on these models' potential for clinical
utility due to the lack of desiderata of comprehensive and meaningful tasks and
sufficiently diverse evaluations to characterize the benefit over conventional
supervised learning. To address this gap, we propose a suite of clinically
meaningful tasks spanning patient outcomes, early prediction of acute and
chronic conditions, including desiderata for robust evaluations. We evaluate
state-of-the-art foundation models on EHR data consisting of 5 million patients
from Columbia University Irving Medical Center (CUMC), a large urban academic
medical center in New York City, across 14 clinically relevant tasks. We
measure overall accuracy, calibration, and subpopulation performance to surface
tradeoffs based on the choice of pre-training, tokenization, and data
representation strategies. Our study aims to advance the empirical evaluation
of structured EHR foundation models and guide the development of future
healthcare foundation models.