Multimodal fusion based few-shot network intrusion detection system.
Journal:
Scientific Reports
Published Date:
Jul 1, 2025
Abstract
As network environments grow more complex and new attack methods emerge more frequently, network attacks are becoming increasingly diverse. For new or rare attacks in particular, collecting a large number of labeled samples is extremely difficult, so training data are limited. Existing few-shot learning methods reduce reliance on large datasets, but most handle only single-modality data and fail to fully exploit complementary information across modalities, which limits detection performance. To address this challenge, we introduce a multimodal fusion-based few-shot network intrusion detection method that merges traffic feature graphs and network feature sets. Tailored to the characteristics of these two modalities, we develop two models: the G-Model and the S-Model. The G-Model employs convolutional neural networks to capture spatial connections in traffic feature graphs, while the S-Model uses the Transformer architecture to process and fuse network feature sets with long-range dependencies. Furthermore, we extensively study the effect of fusing these two modalities at various interaction depths in order to enhance detection performance. Experimental validation on the CICIDS2017 and CICIDS2018 datasets demonstrates that our method achieves multi-class accuracies of 93.40% and 98.50%, respectively, surpassing existing few-shot network intrusion detection methods.
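The abstract does not say which fusion operator or few-shot classifier the method uses, so the following is only a minimal sketch under stated assumptions. It combines two per-sample embeddings by plain concatenation and classifies a query by its nearest class prototype, in the style of prototypical networks. The hard-coded vectors stand in for the outputs of the G-Model and the S-Model, and the names `fuse`, `class_prototypes`, and `classify` are hypothetical, not taken from the paper.

```python
from collections import defaultdict
from math import fsum


def fuse(graph_emb, set_emb):
    """Late fusion by concatenation (an assumed operator, not the paper's)."""
    return list(graph_emb) + list(set_emb)


def class_prototypes(support):
    """Average the fused support embeddings of each class."""
    by_class = defaultdict(list)
    for label, vec in support:
        by_class[label].append(vec)
    return {
        label: [fsum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_class.items()
    }


def classify(query_vec, protos):
    """Return the label whose prototype is nearest in squared Euclidean distance."""
    def sq_dist(a, b):
        return fsum((x - y) ** 2 for x, y in zip(a, b))

    return min(protos, key=lambda label: sq_dist(query_vec, protos[label]))


# Toy 2-way 2-shot episode; every embedding value here is made up.
support = [
    ("benign", fuse([0.1, 0.2], [0.0, 0.1])),
    ("benign", fuse([0.2, 0.1], [0.1, 0.0])),
    ("dos", fuse([0.9, 0.8], [1.0, 0.9])),
    ("dos", fuse([0.8, 0.9], [0.9, 1.0])),
]
protos = class_prototypes(support)
query = fuse([0.85, 0.8], [0.95, 0.9])
prediction = classify(query, protos)
```

Concatenation corresponds to fusing only at the final embedding stage. The abstract describes studying fusion at several interaction depths, which this sketch does not attempt to reproduce.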