Lightweight and efficient skeleton-based sports activity recognition with ASTM-Net.

Journal: PLOS ONE

Abstract

Human Activity Recognition (HAR) plays a pivotal role in video understanding, with applications ranging from surveillance to virtual reality. Skeletal data has emerged as a robust modality for HAR, overcoming challenges such as noisy backgrounds and lighting variations. However, current Graph Convolutional Network (GCN)-based methods for skeletal activity recognition face two key limitations: (1) they fail to capture the dynamic changes in node affinities induced by movement, and (2) they overlook the interplay between spatial and temporal information that is critical for recognizing complex actions. To address these challenges, we propose ASTM-Net, an Activity-aware SpatioTemporal Multi-branch graph convolutional network comprising two novel modules. First, the Activity-aware Spatial Graph convolution Module (ASGM) dynamically models Activity-Aware Adjacency Graphs (3A-Graphs) by fusing a manually initialized physical graph, a learnable graph optimized end-to-end, and a dynamically inferred, activity-related graph, thereby capturing evolving spatial affinities. Second, we introduce the Temporal Multi-branch Graph convolution Module (TMGM), which employs parallel branches of channel reduction, dilated temporal convolutions with varied dilation rates, pooling, and pointwise convolutions to model both fine-grained and long-range temporal dependencies. This multi-branch design not only accommodates diverse action speeds and durations but also maintains parameter efficiency. By integrating ASGM and TMGM, ASTM-Net jointly captures spatial-temporal interactions at significantly reduced computational cost.
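The 3A-Graph fusion in ASGM can be illustrated with a minimal sketch. The function below is a hypothetical simplification, not the paper's implementation: it assumes the activity-related graph is inferred as a softmax-normalized feature affinity between nodes, and fuses it additively with the fixed physical graph and the learnable graph.

```python
import numpy as np

def activity_aware_adjacency(x, a_phys, a_learn):
    """Sketch of 3A-Graph fusion (hypothetical simplification of ASGM).

    x       : (C, V) node features for one frame (C channels, V joints)
    a_phys  : (V, V) fixed adjacency from the physical skeleton
    a_learn : (V, V) freely learnable adjacency, trained end-to-end
    """
    # Activity-related graph: pairwise dot-product affinity between joints,
    # row-normalized with a numerically stable softmax.
    sim = x.T @ x                                   # (V, V)
    sim = sim - sim.max(axis=-1, keepdims=True)
    a_act = np.exp(sim) / np.exp(sim).sum(axis=-1, keepdims=True)
    # Fuse the three graphs; the fused graph then drives the
    # spatial graph convolution over joint features.
    return a_phys + a_learn + a_act
```

Because the activity-related term is recomputed from the input features, the fused adjacency changes with the movement being performed, which is the "evolving spatial affinity" the module is designed to capture.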
Extensive experiments on NTU-RGB+D, NTU-RGB+D 120, and Toyota Smarthome demonstrate ASTM-Net's superiority: it outperforms DualHead-Net-ALLs by 0.31% on NTU-RGB+D X-Sub and surpasses SkateFormer by 2.22% on Toyota Smarthome Cross-Subject; it reduces parameters by 51.9% and FLOPs by 49.7% relative to MST-GCNN-ALLs while improving accuracy by 0.82%; and under 30% random node occlusion it achieves 86.94% accuracy, 3.49% higher than CBAM-STGCN.
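The parameter efficiency claimed for the multi-branch temporal design can be sanity-checked with simple bookkeeping. The sketch below uses hypothetical channel counts and kernel sizes (not the paper's configuration): each branch first shrinks channels with a pointwise convolution and then applies a k-tap dilated temporal convolution; dilation widens the receptive field without adding parameters, so the branched design covers several temporal scales at a fraction of the cost of one full-width convolution.

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a conv layer with kernel length k (bias ignored)."""
    return c_in * c_out * k

def multi_branch_params(channels, n_branches, k):
    """Hypothetical TMGM-style budget: every branch is a 1x1 channel
    reduction followed by a k-tap dilated temporal convolution."""
    c_mid = channels // n_branches
    per_branch = conv_params(channels, c_mid, 1) + conv_params(c_mid, c_mid, k)
    return n_branches * per_branch

# Contrast with a single full-width temporal conv of the same kernel size.
channels, n_branches, k = 256, 4, 5
single = conv_params(channels, channels, k)          # one wide convolution
multi = multi_branch_params(channels, n_branches, k) # four narrow branches
```

With these illustrative numbers, the four-branch module uses 147,456 parameters against 327,680 for the single wide convolution, i.e. roughly 55% fewer, which is in the same spirit as the reported 51.9% parameter reduction.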

Authors

  • Bin Wu
    Department of Psychiatry, Xi'an Mental Health Center, Xi'an, China.
  • Mei Xue
    Beijing Engineering Research Center of Diagnosis and Treatment of Respiratory and Critical Care Medicine, Beijing Chaoyang Hospital, Beijing 100043, China.
  • Ying Jia
    Department of Pathology, The Fourth Hospital of Hebei Medical University, No. 12 Jiankang Road, Shijiazhuang, 050011, Hebei, China.
  • Ning Zhang
    Institute of Nuclear Agricultural Sciences, Zhejiang University, Hangzhou, 310058, China.
  • Guojin Zhao
    School of Electronic and Electrical Engineering, Wuhan Textile University, Wuhan 430200, China.
  • Xiuping Wang
    Department of Neurology, the First Affiliated Hospital of Jiamusi University, Jiamusi 154000, Heilongjiang, China.
  • Chunlei Zhang
    Center for Robust Speech Systems (CRSS), The University of Texas at Dallas, 800 West Campbell Road, Richardson, Texas 75080, USA.