A Generative Foundation Model for Structured Patient Trajectory Data.
Journal:
AMIA ... Annual Symposium proceedings. AMIA Symposium
Published Date:
May 22, 2025
Abstract
Advances in artificial intelligence have propelled the implementation of general-purpose multitasking agents called foundation models. However, foundation models have struggled to handle structured longitudinal medical data because these data mix data types and carry variable timestamps. Acquiring sufficiently large training data is another obstacle. This study proposes a generative foundation model that handles patient trajectory data of variable lengths with mixed data types (categorical and continuous variables). We also propose a data pipeline that supplies real-world data at a scale large enough to support foundation models. We locally obtained a large clinical dataset through a reproducible data pipeline that leveraged a national HL7 message standard. The trained model acquired the ability to suggest clinically relevant medical concepts and continuous variables for general purposes. The model also synthesized a database of more than 10,000 realistic patient trajectories. Our results suggest promising future downstream clinical applications of the foundation model.