HRTR: A Single-stage Transformer for Fine-grained Sub-second Action Segmentation in Stroke Rehabilitation
Journal: arXiv
Published Date: Jun 3, 2025
Abstract
Stroke rehabilitation often demands precise tracking of patient movements to
monitor progress, and the complexity of rehabilitation exercises presents two
critical challenges: fine-grained and sub-second (under one second) action
detection. In this work, we propose the High Resolution Temporal Transformer
(HRTR) to time-localize and classify high-resolution (fine-grained),
sub-second actions in a single-stage transformer, eliminating the need for
multi-stage methods and post-processing. Without any refinements, HRTR
outperforms state-of-the-art systems on both stroke-related and general
datasets, achieving an Edit Score (ES) of 70.1 on StrokeRehab Video, 69.4 on
StrokeRehab IMU, and 88.4 on 50Salads.
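
The abstract reports results as an Edit Score but does not define it. As a rough illustration only, the sketch below assumes the segmental edit score commonly used in action segmentation: per-frame labels are collapsed into an ordered list of segment labels, and the Levenshtein distance between the predicted and ground-truth segment lists is normalized by the longer list and rescaled to 0-100. The function names and the action labels "A", "B", "C" are hypothetical, and the exact evaluation protocol used for HRTR may differ.

```python
from itertools import groupby


def segments(frame_labels):
    """Collapse per-frame labels into the ordered list of segment labels."""
    return [label for label, _ in groupby(frame_labels)]


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute = 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_score(pred_frames, true_frames):
    """Segmental edit score in [0, 100]; assumed definition, see lead-in."""
    pred_segs = segments(pred_frames)
    true_segs = segments(true_frames)
    longest = max(len(pred_segs), len(true_segs))
    if longest == 0:
        return 100.0
    return (1.0 - levenshtein(pred_segs, true_segs) / longest) * 100.0


# Hypothetical per-frame labels for three actions "A", "B", "C".
truth = ["A", "A", "A", "B", "B", "C", "C", "C"]
shifted = ["A", "A", "B", "B", "B", "C", "C", "C"]   # boundaries off by a frame
spurious = ["A", "A", "C", "B", "B", "C", "C", "C"]  # one extra short segment

print(edit_score(shifted, truth))
print(edit_score(spurious, truth))
```

Under this definition, a prediction whose boundaries are slightly shifted still scores 100 because the segment order is unchanged, while a single spurious sub-second segment lowers the score. This is why the metric is sensitive to over-segmentation, which matters for short actions.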