Time-Series U-Net with Recurrence for Noise-Robust Imaging Photoplethysmography
Journal: arXiv
Published Date: Mar 21, 2025
Abstract
Remote estimation of vital signs enables health monitoring in situations where
contact-based devices are unavailable, too intrusive, or too expensive.
In this paper, we present a modular, interpretable pipeline for
pulse signal estimation from video of the face that achieves state-of-the-art
results on publicly available datasets. Our imaging photoplethysmography (iPPG)
system consists of three modules: face and landmark detection, time-series
extraction, and pulse signal/pulse rate estimation. Unlike many deep learning
methods that make use of a single black-box model that maps directly from input
video to output signal or heart rate, our modular approach enables each of the
three parts of the pipeline to be interpreted individually. The pulse signal
estimation module, which we call TURNIP (Time-Series U-Net with Recurrence for
Noise-Robust Imaging Photoplethysmography), allows the system to faithfully
reconstruct the underlying pulse signal waveform and use it to measure heart
rate and pulse rate variability metrics, even in the presence of motion. When
parts of the face are occluded due to extreme head poses, our system explicitly
detects such "self-occluded" regions and maintains estimation robustness
despite the missing information. Our algorithm provides reliable heart rate
estimates without the need for specialized sensors or contact with the skin,
outperforming previous iPPG methods on both color (RGB) and near-infrared (NIR)
datasets.
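
To make the last stage of the pipeline concrete, the following is a minimal sketch of how a heart rate could be read off a pulse waveform once one has been recovered. It is not the TURNIP network and not the authors' procedure: it uses a plain spectral-peak estimate as a stand-in. The function name `estimate_heart_rate_bpm`, the 0.7 to 3.0 Hz search band, the 30 fps frame rate, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np


def estimate_heart_rate_bpm(pulse, fps, lo_hz=0.7, hi_hz=3.0):
    """Return the dominant in-band frequency of `pulse`, in beats per minute.

    Hypothetical helper: a generic FFT-peak estimator, not the paper's method.
    `lo_hz` and `hi_hz` bound the search to an assumed plausible heart-rate
    range (42 to 180 bpm).
    """
    pulse = np.asarray(pulse, dtype=float)
    pulse = pulse - pulse.mean()              # remove the DC offset
    window = np.hanning(len(pulse))           # taper to reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(pulse * window)) ** 2
    freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz


# Synthetic stand-in for a recovered pulse signal: a 1.2 Hz sinusoid
# (72 beats per minute) plus Gaussian noise, sampled at an assumed 30 fps.
fps = 30.0
t = np.arange(600) / fps                      # 20 seconds of samples
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)

bpm = estimate_heart_rate_bpm(signal, fps)
print(f"estimated heart rate: {bpm:.1f} bpm")
```

A spectral peak only yields an average rate over the window. Pulse rate variability metrics depend on beat-to-beat timing, which is why faithful reconstruction of the waveform itself matters for the system described above.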