GTIGNet: Global Topology Interaction Graphormer Network for 3D hand pose estimation.
Journal:
Neural Networks: The Official Journal of the International Neural Network Society
Published Date:
Feb 4, 2025
Abstract
Estimating 3D hand poses from monocular RGB images presents a series of challenges, including complex hand structures, self-occlusions, and depth ambiguities. Existing methods often fall short of capturing the long-range dependencies of skeletal and non-skeletal connections between hand joints. To address these limitations, we introduce the Global Topology Interaction Graphormer Network (GTIGNet), a novel deep learning architecture designed to improve 3D hand pose estimation. Our model incorporates a Context-Aware Attention Block (CAAB) within the 2D pose estimator to enhance the extraction of multi-scale features, yielding more accurate 2D joint heatmaps to support the subsequent 3D estimation task. Additionally, we introduce a High-Order Graphormer that explicitly and implicitly models the topological structure of hand joints, thereby enhancing feature interaction. Ablation studies confirm the effectiveness of our approach, and experimental results on four challenging datasets, the Rendered Hand Dataset (RHD), the Stereo Hand Pose Benchmark (STB), the First-Person Hand Action Benchmark (FPHA), and the FreiHAND dataset, indicate that GTIGNet achieves state-of-the-art performance in 3D hand pose estimation. Notably, our model achieves a Mean Per Joint Position Error (MPJPE) of 9.98 mm on RHD, 6.12 mm on STB, 11.15 mm on FPHA, and 10.97 mm on FreiHAND.
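The MPJPE figures reported above follow the standard definition of the metric: the Euclidean distance between each predicted and ground-truth 3D joint, averaged over all joints. A minimal sketch of the computation (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per Joint Position Error: average Euclidean distance
    between predicted and ground-truth 3D joint positions.

    pred, gt: arrays of shape (num_joints, 3), in millimetres.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    # Per-joint Euclidean distance, then mean over joints.
    return np.linalg.norm(pred - gt, axis=-1).mean()

# Toy example: 21 hand joints, every prediction offset by 10 mm along x.
gt = np.zeros((21, 3))
pred = gt + np.array([10.0, 0.0, 0.0])
print(mpjpe(pred, gt))  # 10.0
```

In practice the metric is averaged over all test frames as well, and predictions are usually root-aligned (translated so the wrist joints coincide) before the distances are taken.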