Highly Condensed All-MLP Architecture for Long-Term Human Motion Prediction.
Journal:
IEEE Transactions on Neural Networks and Learning Systems
Published Date:
Jul 8, 2025
Abstract
In artificial intelligence (AI) scenarios where computational resources are constrained, such as autonomous driving systems, it is challenging to construct a lightweight model that can accurately predict human motion over extended durations. To tackle this challenge, we introduce a highly condensed all-multilayer perceptron (HCMLP) architecture engineered for extreme lightweight efficiency. This design enables extended-range motion prediction without compromising performance. First, the spatiotemporal dynamic perception (STDP) block enhances operational efficiency while maintaining a simple structure. In STDP, a spatial multilayer perceptron (SMLP) and a temporal multilayer perceptron (TMLP) run in parallel: the SMLP captures the spatial correlations between pose joints, while the TMLP captures the temporal dynamics of each joint. The subsequent dynamic aggregation (DA), coupled with the channel multilayer perceptron (CMLP), dynamically consolidates and refines the spatial and temporal features, leading to improved predictive accuracy. Second, the multiterm union prediction (MTUP) block directly delivers predictions for horizons ranging from 0 to 4000 ms, eliminating the need for repeated short-term (ST) prediction iterations. Our experimental results on the Human3.6M, AMASS, 3DPW, and CMU-Mocap datasets demonstrate that HCMLP outperforms existing state-of-the-art (SOTA) methods in ST prediction, long-term (LT) prediction, and especially in extended and extra-extended LT (ELT) predictions, all while using the fewest parameters.
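The data flow described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not specify layer sizes, the form of the dynamic aggregation, activations, or residual connections, so the sigmoid gate, the ReLU, the residual, and every function and weight name below are assumptions made for illustration. The sizes are also assumed: 10 input frames, 22 joints with 3 coordinates each, and 100 output frames (which would cover 4000 ms only at an assumed 25 frames per second). Weights are random, so the output is meaningless as a prediction; the sketch only shows which axis each component mixes and that the whole horizon is produced in one pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def init(*shape):
    """Small random weights; a trained model would learn these."""
    return rng.standard_normal(shape) * 0.02

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stdp_block(x, w_s, w_t, w_g, w_c):
    """x has shape (frames, joints, channels). All weight names are hypothetical."""
    spatial = np.einsum("tjc,jk->tkc", x, w_s)    # spatial branch: mix across joints
    temporal = np.einsum("tjc,ts->sjc", x, w_t)   # temporal branch: mix across frames
    # Assumed form of dynamic aggregation: an input-dependent gate
    # that weighs the two parallel branches element by element.
    gate = sigmoid(np.concatenate([spatial, temporal], axis=-1) @ w_g)
    fused = gate * spatial + (1.0 - gate) * temporal
    refined = np.maximum(fused @ w_c, 0.0)        # channel mixing followed by ReLU
    return x + refined                            # residual connection (assumption)

def multi_term_head(h, w_out):
    """Map all input frames to all output frames in one pass, with no rollout."""
    return np.einsum("tjc,ts->sjc", h, w_out)

T_IN, T_OUT, J, C = 10, 100, 22, 3                # assumed sizes, see lead-in
x = rng.standard_normal((T_IN, J, C))             # stand-in for an observed pose sequence
h = stdp_block(x, init(J, J), init(T_IN, T_IN), init(2 * C, C), init(C, C))
pred = multi_term_head(h, init(T_IN, T_OUT))
print(pred.shape)  # (100, 22, 3)
```

The single matrix over the time axis in `multi_term_head` is what stands in for the non-iterative prediction: an autoregressive model would instead feed short predictions back in repeatedly to reach the same horizon.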