RoGAtten: Rotary gated linear attention for multivariate time series forecasting.

Journal: Neural networks : the official journal of the International Neural Network Society

Abstract

Thousands of network nodes in the Internet of Things produce vast amounts of long-term time series. Predicting network traffic helps identify security risks and improve network management. In the past few years, Transformer-based models (Transformers) have achieved superior prediction accuracy. However, the attention mechanism faces the challenge of balancing expressivity and computational efficiency. Recently, an effective state space model named Mamba has been proposed. It demonstrates exceptional capabilities for modeling long-term dependencies. Meanwhile, its gating network structure also provides inspiration for enhancing the attention mechanism. In this paper, we theoretically prove that linear attention with rotary positional embeddings can be rewritten in a form similar to Mamba. Building on this insight, we design a scalable rotary position embedding (SRoPE) mechanism that introduces a scaling factor to adjust information flow while retaining the relative positional relationships. This confers a forget-gate-like capability on the model and allows seamless integration with existing multi-head mechanisms, achieving greater expressiveness than previous attention variants. We then propose Rotary Gated linear Attention (RoGAtten) for multivariate time series forecasting. RoGAtten is employed to capture inter-series dependencies. SRoPE provides series-wise discriminative identifiers and adjusts the strength of interactions between variables, enabling predictions that better align with domain knowledge. Extensive experiments on 8 real-world datasets show that RoGAtten reduces MSE by 3.85% and MAE by 1.71% compared with state-of-the-art methods.
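
The abstract gives no equations, so the following is only an illustrative sketch of the general idea it describes, not the authors' SRoPE or RoGAtten formulation. It assumes unnormalized causal linear attention with an identity feature map, represents each rotary 2-D pair as one complex number, and uses a single scalar scaling factor `gamma`. The function names, `theta`, and `gamma` are all hypothetical. Under these assumptions, attention with rotary embeddings and a distance-dependent scale can be computed either in parallel or as a linear recurrence whose state is multiplied by `gamma * exp(i * theta)` at every step, which is the kind of gated state update found in state space models.

```python
import numpy as np

def rotary_gated_parallel(q, k, v, theta, gamma):
    """Parallel form: rotate q and k by position, then scale each score by gamma**(t - s)."""
    T = q.shape[0]
    pos = np.arange(T)
    rot = np.exp(1j * pos[:, None] * theta[None, :])    # (T, d/2) rotary phases
    qr, kr = q * rot, k * rot
    scores = np.real(qr @ kr.conj().T)                   # depends only on t - s
    lag = pos[:, None] - pos[None, :]
    # Causal mask plus geometric decay (assumed scalar scaling factor).
    decay = np.where(lag >= 0, gamma ** np.maximum(lag, 0), 0.0)
    return (scores * decay) @ v

def rotary_gated_recurrent(q, k, v, theta, gamma):
    """Recurrent form: a diagonal complex state update with a forget-gate-like factor."""
    T, dv = v.shape
    a = gamma * np.exp(1j * theta)                       # per-channel state transition
    H = np.zeros((q.shape[1], dv), dtype=complex)
    out = np.empty((T, dv))
    for t in range(T):
        H = a[:, None] * H + np.conj(k[t])[:, None] * v[t][None, :]
        out[t] = np.real(q[t] @ H)
    return out

# Small random check that the two forms agree.
rng = np.random.default_rng(0)
T, half_d, dv = 6, 4, 3
q = rng.normal(size=(T, half_d)) + 1j * rng.normal(size=(T, half_d))
k = rng.normal(size=(T, half_d)) + 1j * rng.normal(size=(T, half_d))
v = rng.normal(size=(T, dv))
theta = 10000.0 ** (-np.arange(half_d) / half_d)         # standard RoPE-style frequencies
out_parallel = rotary_gated_parallel(q, k, v, theta, gamma=0.9)
out_recurrent = rotary_gated_recurrent(q, k, v, theta, gamma=0.9)
assert np.allclose(out_parallel, out_recurrent)
```

With `gamma = 1` the sketch reduces to plain linear attention with rotary embeddings. Values below 1 down-weight contributions from distant positions while leaving the relative rotation unchanged.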
