Dynamical Label Augmentation and Calibration for Noisy Electronic Health Records
Journal:
arXiv
Published Date:
May 12, 2025
Abstract
Medical research, particularly in predicting patient outcomes, heavily relies
on medical time series data extracted from Electronic Health Records (EHR),
which provide extensive information on patient histories. Despite rigorous
examination, labeling errors are inevitable and can significantly impede
accurate predictions of patient outcomes. To address this challenge, we propose
an Attention-based Learning Framework with Dynamic
Calibration and Augmentation for Time series Noisy
Label Learning (ACTLL). This framework leverages a
two-component Beta mixture model to identify the certain and uncertain sets of
instances based on the fitness distribution of each class, and it captures
global temporal dynamics while dynamically calibrating labels from the
uncertain set or augmenting confident instances from the certain set.
Experimental results on the large-scale EHR datasets eICU and MIMIC-IV-ED, and
on several benchmark datasets from the UCR and UEA repositories, demonstrate
that ACTLL achieves state-of-the-art performance, especially under
high noise levels.
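
To make the certain/uncertain split concrete, the following is a minimal sketch of fitting a two-component Beta mixture to per-instance scores of a single class and thresholding the posterior. It is not the ACTLL implementation. It assumes that the score is a per-instance loss normalized to [0, 1], that the low-mean component corresponds to cleanly labeled instances, and that a posterior threshold of 0.5 separates the two sets. The function names, the initialization, and the method-of-moments M-step are likewise illustrative choices, not details taken from the abstract.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def posteriors(xs, params, weights):
    """Per-instance responsibilities of the two mixture components."""
    resp = []
    for x in xs:
        p = [w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, params)]
        total = sum(p) or 1e-300  # guard against underflow to zero
        resp.append([pk / total for pk in p])
    return resp


def fit_beta_mixture(values, n_iter=50, eps=1e-4):
    """EM for a two-component Beta mixture (weighted method-of-moments M-step)."""
    # Clip away exact 0 and 1, where the Beta log-density is undefined.
    xs = [min(max(v, eps), 1 - eps) for v in values]
    params = [(2.0, 5.0), (5.0, 2.0)]  # assumed init: one low, one high component
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        resp = posteriors(xs, params, weights)  # E-step
        for k in range(2):  # M-step
            rk = [r[k] for r in resp]
            nk = max(sum(rk), 1e-12)
            mean = sum(r * x for r, x in zip(rk, xs)) / nk
            var = sum(r * (x - mean) ** 2 for r, x in zip(rk, xs)) / nk
            common = mean * (1 - mean) / max(var, 1e-8) - 1
            params[k] = (max(mean * common, 1e-2), max((1 - mean) * common, 1e-2))
            weights[k] = nk / len(xs)
    return xs, params, weights


def split_certain_uncertain(values, threshold=0.5):
    """Return (certain, uncertain) index lists for one class's scores."""
    xs, params, weights = fit_beta_mixture(values)
    means = [a / (a + b) for a, b in params]
    clean = 0 if means[0] <= means[1] else 1  # assumption: low score = clean label
    resp = posteriors(xs, params, weights)
    certain = [i for i, r in enumerate(resp) if r[clean] >= threshold]
    uncertain = [i for i, r in enumerate(resp) if r[clean] < threshold]
    return certain, uncertain
```

In a per-class setup, `split_certain_uncertain` would be called once per class on that class's scores. Under the abstract's description, instances in the uncertain list would be candidates for label calibration and those in the certain list candidates for augmentation.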