Transformer representation learning is necessary for dynamic multi-modal physiological data on small-cohort patients
Journal:
arXiv
Published Date:
Apr 5, 2025
Abstract
Postoperative delirium (POD), a severe neuropsychiatric complication
affecting nearly 50% of high-risk surgical patients, is defined as an acute
disorder of attention and cognition. It remains significantly underdiagnosed in
intensive care units (ICUs) due to subjective monitoring methods. Early and
accurate diagnosis of POD is critical and achievable. Here, we propose a POD
prediction framework comprising a Transformer representation model followed by
traditional machine learning algorithms. Our approach utilizes multi-modal
physiological data, including amplitude-integrated electroencephalography
(aEEG), vital signs, electrocardiographic monitor data, and hemodynamic
parameters. We curated the first multi-modal POD dataset encompassing two
patient types and evaluated various Transformer architectures for
representation learning. Empirical results indicate consistent improvements
in sensitivity and Youden index for patient TYPE I using Transformer
representations, particularly our fusion adaptation of Pathformer. By enabling
effective delirium diagnosis from postoperative day 1 to 3, our extensive
experimental findings emphasize the potential of multi-modal physiological data
and highlight the necessity of representation learning via multi-modal
Transformer architectures in clinical diagnosis.
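
The abstract reports results in terms of sensitivity and the Youden index. As a reminder of how these metrics relate, the following is a minimal sketch using the standard definition J = sensitivity + specificity - 1. The function names and the toy labels are hypothetical and are not taken from the paper's code or dataset.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = delirium."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(y_true, y_pred):
    """Youden's J statistic: sensitivity + specificity - 1."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return sens + spec - 1.0


# Hypothetical labels for eight patients (illustrative only, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
j = youden_index(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} J={j:.2f}")
```

A Youden index of 0 corresponds to chance-level discrimination and 1 to perfect separation, so it summarizes in one number the trade-off between missed cases and false alarms.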