Automated ARAT Scoring Using Multimodal Video Analysis, Multi-View Fusion, and Hierarchical Bayesian Models: A Clinician Study
Journal: arXiv
Published Date: May 3, 2025
Abstract
Manual scoring of the Action Research Arm Test (ARAT) for upper-extremity
assessment in stroke rehabilitation is time-intensive and subject to rater variability. We propose
an automated ARAT scoring system integrating multimodal video analysis with
SlowFast, I3D, and Transformer-based models using OpenPose keypoints and object
locations. Our approach employs multi-view data (ipsilateral, contralateral,
and top perspectives), applying early and late fusion to combine features
across views and models. Hierarchical Bayesian Models (HBMs) infer movement
quality components, enhancing interpretability. A clinician dashboard displays
task scores, execution times, and quality assessments. In a study, five
clinicians reviewed 500 video ratings generated by our system and provided
feedback on its accuracy and usability. Evaluated on a stroke
rehabilitation dataset, our framework achieves 89.0% validation accuracy with
late fusion, and the HBM inferences align closely with manual assessments. This work
advances automated rehabilitation assessment by offering a scalable,
interpretable solution with clinical validation.
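
The abstract does not spell out how the two fusion strategies are implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each view (ipsilateral, contralateral, top) yields either a feature vector (early fusion, by concatenation) or a vector of logits over the four ARAT item scores 0 to 3 (late fusion, by averaging per-view probabilities). All function names, the weighting scheme, and the example numbers are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def early_fusion(view_features):
    """Early fusion (assumed form): concatenate per-view feature vectors
    into one vector that a single downstream classifier would consume."""
    names = sorted(view_features)  # fixed order so the layout is reproducible
    return np.concatenate(
        [np.asarray(view_features[n], dtype=float) for n in names]
    )


def late_fusion(view_logits, weights=None):
    """Late fusion (assumed form): convert each view's logits to
    probabilities, then take a weighted average across views.

    Returns the predicted item score (argmax class) and the fused
    probability vector.
    """
    names = sorted(view_logits)
    probs = np.stack(
        [softmax(np.asarray(view_logits[n], dtype=float)) for n in names]
    )
    if weights is None:
        w = np.full(len(names), 1.0 / len(names))  # equal weight per view
    else:
        w = np.array([weights[n] for n in names], dtype=float)
        w = w / w.sum()  # normalize so the fused vector sums to 1
    fused = w @ probs  # shape: (n_classes,)
    return int(fused.argmax()), fused


# Hypothetical logits over ARAT item scores 0..3 for one task recording.
example_logits = {
    "ipsilateral": [0.1, 0.2, 2.0, 1.0],
    "contralateral": [0.0, 0.5, 1.5, 1.8],
    "top": [0.2, 0.1, 2.2, 0.9],
}

score, fused = late_fusion(example_logits)
print(score, np.round(fused, 3))
```

In this made-up example the contralateral view alone favors a score of 3, while the other two views favor 2. Averaging probabilities lets the majority of views outvote the outlier, which is one common motivation for late fusion when a single camera angle is occluded or ambiguous.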