HGMamba: Enhancing 3D Human Pose Estimation with a HyperGCN-Mamba Network
Journal: arXiv
Published Date: Apr 9, 2025
Abstract
3D human pose lifting is a promising research area that leverages estimated
and ground-truth 2D human pose data for training. While existing approaches
primarily aim to improve accuracy on estimated 2D poses, they often
struggle when applied to ground-truth 2D pose data. We observe that achieving
accurate 3D pose reconstruction from ground-truth 2D poses requires precise
modeling of local pose structures, alongside the ability to extract robust
global spatio-temporal features. To address these challenges, we propose a
novel Hyper-GCN and Shuffle Mamba (HGMamba) block, which processes input data
through two parallel streams: Hyper-GCN and Shuffle Mamba. The Hyper-GCN stream
models the human body structure as hypergraphs with varying levels of
granularity to effectively capture local joint dependencies. Meanwhile, the
Shuffle Mamba stream leverages a state space model to perform spatio-temporal
scanning across all joints, enabling the establishment of global dependencies.
By adaptively fusing these two representations, HGMamba achieves strong global
feature modeling while excelling at local structure modeling. We stack multiple
HGMamba blocks to create three variants of our model, allowing users to select
the most suitable configuration based on the desired speed-accuracy trade-off.
Extensive evaluations on the Human3.6M and MPI-INF-3DHP benchmark datasets
demonstrate the effectiveness of our approach. HGMamba-B achieves
state-of-the-art results, with P1 errors of 38.65 mm and 14.33 mm on the
respective datasets. Code and models are available at
https://github.com/HuCui2022/HGMamba
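
The two-stream idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the hypergraph layout, the scan order, or the fusion rule, so everything below is an assumption made for illustration. The local stream is a plain mean aggregation over hyperedges, the global stream is a toy linear recurrence standing in for a state space model (the shuffle step is omitted), and the fusion is a per-channel sigmoid gate. The names `hypergraph_aggregate`, `toy_scan`, and `adaptive_fuse`, and the 5-joint skeleton, are hypothetical.

```python
import numpy as np


def hypergraph_aggregate(x, incidence):
    """Local stream (assumed form): joints -> hyperedges -> joints.

    x:         (T, J, C) features for T frames, J joints, C channels
    incidence: (J, E) binary matrix, 1 if joint j belongs to hyperedge e
    """
    edge_deg = incidence.sum(axis=0)  # joints per hyperedge, shape (E,)
    node_deg = incidence.sum(axis=1)  # hyperedges per joint, shape (J,)
    # Average the member joints of each hyperedge.
    edge_feat = np.einsum("je,tjc->tec", incidence, x) / edge_deg[None, :, None]
    # Average the hyperedges that each joint belongs to.
    return np.einsum("je,tec->tjc", incidence, edge_feat) / node_deg[None, :, None]


def toy_scan(x, decay=0.9):
    """Global stream stand-in: a linear recurrence over all joint-time tokens.

    h_k = decay * h_{k-1} + (1 - decay) * u_k, scanned over the T*J tokens
    in frame-major order. A real Mamba layer is considerably more involved.
    """
    x = np.asarray(x, dtype=float)
    T, J, C = x.shape
    tokens = x.reshape(T * J, C)
    out = np.empty_like(tokens)
    h = np.zeros(C)
    for k in range(T * J):
        h = decay * h + (1.0 - decay) * tokens[k]
        out[k] = h
    return out.reshape(T, J, C)


def adaptive_fuse(local_feat, global_feat, gate_logit):
    """Assumed fusion rule: per-channel sigmoid gate between the two streams."""
    alpha = 1.0 / (1.0 + np.exp(-gate_logit))  # shape (C,), values in (0, 1)
    return alpha * local_feat + (1.0 - alpha) * global_feat


# Hypothetical 5-joint chain with two overlapping body-part hyperedges.
incidence = np.array(
    [[1, 0],
     [1, 0],
     [1, 1],
     [0, 1],
     [0, 1]],
    dtype=float,
)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 3))  # 4 frames, 5 joints, 3 channels
gate_logit = np.zeros(3)        # would be a learned parameter in practice

fused = adaptive_fuse(hypergraph_aggregate(x, incidence), toy_scan(x), gate_logit)
print(fused.shape)
```

With `gate_logit` at zero the two streams are weighted equally; in a trained model the gate would be learned so that each channel can lean toward local structure or global context.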