One-dimensional time-frequency dual-channel visual transformer for bearing fault diagnosis under strong noise and limited data conditions.
Journal:
Scientific Reports
Published Date:
Jul 20, 2025
Abstract
In industrial settings, bearing health directly affects equipment stability, making accurate and efficient fault diagnosis critical for operational safety. Transformer models have recently been widely adopted for bearing fault diagnosis because of their strong global modeling capabilities, but they still face significant challenges under strong noise and limited data. To address these challenges, this paper proposes an end-to-end one-dimensional Vision Transformer with time-frequency fusion and dual attention across the spatial and channel dimensions. The model adopts a dual-branch design: the time-domain branch incorporates spatial and channel attention to capture both local and global features, while the frequency-domain branch uses the fast Fourier transform (FFT) to extract spectral information and fuses it with temporal features for efficient multi-scale modeling. To further enhance sensitivity to local patterns and periodic variations, a cross-scale convolution module and a periodic feedforward network are introduced. Experiments on the CWRU and PU datasets show that the proposed model achieves 99.42% and 98.14% accuracy, respectively, under noisy and data-scarce conditions. The results confirm superior noise robustness and diagnostic performance over recent state-of-the-art methods, highlighting the model's practical potential for real-world industrial applications.
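The dual-branch input described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows how a one-dimensional vibration segment might be turned into a time-domain channel and a frequency-domain (spectrum magnitude) channel, and how additive Gaussian noise at a chosen signal-to-noise ratio might be simulated. The function names, the 12 kHz sampling rate, the 256-sample window, and the 1.5 kHz test tone are all hypothetical values chosen for illustration. A naive DFT is used to keep the example dependency-free; in practice a library FFT such as `numpy.fft.fft` would take its place.

```python
import cmath
import math
import random


def dft_magnitude(x):
    """Naive O(N^2) DFT magnitude; stands in for an FFT to avoid dependencies."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(n)
    ]


def add_gaussian_noise(x, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio (dB)."""
    signal_power = sum(v * v for v in x) / len(x)
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [v + rng.gauss(0.0, noise_std) for v in x]


def time_frequency_channels(x):
    """Return [time_channel, freq_channel], each with len(x) samples."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n) or 1.0
    # Time channel: zero-mean, unit-variance copy of the raw segment.
    time_ch = [(v - mean) / std for v in x]
    # Frequency channel: spectrum magnitude scaled so its maximum is 1.
    mags = dft_magnitude(time_ch)
    peak = max(mags) or 1.0
    freq_ch = [m / peak for m in mags]
    return [time_ch, freq_ch]


# Hypothetical example values (assumptions, not taken from the paper).
fs, n, tone_hz = 12_000, 256, 1_500
clean = [math.sin(2 * math.pi * tone_hz * t / fs) for t in range(n)]
noisy = add_gaussian_noise(clean, snr_db=0, rng=random.Random(0))

channels = time_frequency_channels(noisy)
half = channels[1][: n // 2]       # keep the non-mirrored half of the spectrum
peak_bin = half.index(max(half))   # bin index of the strongest component
```

In this toy setting the tone falls on bin `tone_hz * n / fs = 32`, and the spectral peak stays there even at 0 dB SNR although the time-domain waveform is heavily corrupted. This is one intuition for why a frequency-domain branch can help under strong noise. How the two channels are actually fused inside the proposed model is not specified in the abstract.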