Inference-Time Gaze Refinement for Micro-Expression Recognition: Enhancing Event-Based Eye Tracking with Motion-Aware Post-Processing
Journal: arXiv
Published Date: Jun 14, 2025
Abstract
Event-based eye tracking holds significant promise for fine-grained cognitive
state inference: its high temporal resolution and robustness to motion
artifacts are critical for decoding subtle mental states such as
attention, confusion, or fatigue. In this work, we introduce a model-agnostic,
inference-time refinement framework designed to enhance the output of existing
event-based gaze estimation models without modifying their architecture or
requiring retraining. Our method comprises two key post-processing modules: (i)
Motion-Aware Median Filtering, which suppresses blink-induced spikes while
preserving natural gaze dynamics, and (ii) Optical Flow-Based Local Refinement,
which aligns gaze predictions with cumulative event motion to reduce spatial
jitter and temporal discontinuities. To complement traditional spatial accuracy
metrics, we propose a novel Jitter Metric that captures the temporal smoothness
of predicted gaze trajectories based on velocity regularity and local signal
complexity. Together, these contributions significantly improve the consistency
of event-based gaze signals, making them better suited for downstream tasks
such as micro-expression analysis and mind-state decoding. Our results
demonstrate consistent improvements across multiple baseline models on
controlled datasets, laying the groundwork for future integration with
multimodal affect recognition systems in real-world environments.
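
As a rough illustration of the post-processing idea described above, the following Python sketch applies a median filter only where local gaze motion is low, so that isolated blink-like spikes are suppressed while fast, sustained movements pass through unchanged. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `motion_aware_median` and `jitter_proxy`, the `window` and `speed_thresh` parameters, and the gating rule are all hypothetical. In particular, `jitter_proxy` is a simple mean second-difference magnitude used here as a stand-in; it is not the Jitter Metric proposed in the paper, whose exact definition is not given in the abstract.

```python
import math
from statistics import median


def motion_aware_median(gaze, window=5, speed_thresh=2.0):
    """Median-filter a gaze trajectory only where local motion is low.

    gaze: list of (x, y) tuples, one per frame (units assumed to be pixels).
    window: odd filter length in frames (hypothetical default).
    speed_thresh: per-frame speed below which the local median is used
        (hypothetical default).
    """
    n = len(gaze)
    half = window // 2
    # Per-frame displacement magnitude; the first frame has no predecessor.
    speed = [0.0] + [math.dist(gaze[i], gaze[i - 1]) for i in range(1, n)]
    out = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        # The median speed in the window is robust to a single spike,
        # which contributes only two large displacements.
        if median(speed[lo:hi]) < speed_thresh:
            xs = [p[0] for p in gaze[lo:hi]]
            ys = [p[1] for p in gaze[lo:hi]]
            out.append((median(xs), median(ys)))
        else:
            # Sustained motion (e.g. a saccade or pursuit): keep the raw sample.
            out.append(tuple(gaze[t]))
    return out


def jitter_proxy(gaze):
    """Mean magnitude of the discrete second difference (a stand-in only)."""
    acc = [
        math.hypot(
            gaze[i + 1][0] - 2 * gaze[i][0] + gaze[i - 1][0],
            gaze[i + 1][1] - 2 * gaze[i][1] + gaze[i - 1][1],
        )
        for i in range(1, len(gaze) - 1)
    ]
    return sum(acc) / len(acc) if acc else 0.0
```

Because the gate uses the median speed within the window rather than the instantaneous speed, a one-frame spike during a fixation still counts as low motion and is replaced by the local median, whereas a steadily moving trajectory exceeds the threshold at every frame and is left untouched. Comparing `jitter_proxy` before and after filtering gives a crude check of how much smoother the trajectory has become.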