Design of an AI-based security anomaly detection system for IoT terminals based on the ViT-transformer fusion model.
Journal:
Scientific Reports
Published Date:
Jun 2, 2026
Abstract
The deep penetration of IoT terminals into water systems, healthcare, transportation, and other fields has intensified security threats such as cyber-physical attacks and traffic anomalies. Traditional anomaly detection methods, however, depend on labeled data, generalize poorly, consume substantial resources, and pose significant privacy risks. Although the Vision Transformer (ViT) is good at capturing global features, it is difficult to deploy directly on resource-constrained IoT terminals. Existing hybrid deep learning models have improved detection accuracy, but lightweight ViT fusion models remain poorly adapted to terminals, applications of multi-modal data fusion are scarce, and balancing dynamic scheduling with privacy protection in end-edge-cloud collaboration remains an open problem. To address these issues, this paper proposes an AI-based security anomaly detection system for IoT terminals built on a ViT-Transformer fusion model. The system adopts a three-level end-edge-cloud collaborative architecture; integrates multi-modal data such as network traffic, sensor time series, and side-channel signals, achieving cross-modal feature fusion through tokenization; combines pruning, distillation, and quantization to raise the model compression ratio to 70%; introduces elliptic-curve certificateless public key encryption (CL-PKE) and Boneh-Lynn-Shacham (BLS) batch authentication to ensure data security; and uses federated learning to aggregate edge model updates and optimize global performance. Experiments were conducted on public datasets such as IoT-23 and UCI, as well as on a custom-built testbed.
The results show that the model achieves an accuracy of 89.2% and an F1-score of 0.87 in multi-modal anomaly detection, with a terminal inference latency of 90 ms and a memory footprint of 30 MB, making it suitable for low-compute devices such as the Raspberry Pi 4B (RPi4B) and Arduino. CL-PKE resists brute-force attacks for 5.2 × 10^6 seconds, and batch authentication of 100 terminals takes only 75 ms. The system generalizes well across smart home, industrial IoT, and other scenarios, with a defense success rate of 85.3% against FGSM attacks. This study addresses the resource bottlenecks and security pain points of existing methods, providing an efficient and reliable technical solution for IoT terminal security.
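The abstract states that federated learning aggregates edge model updates but does not specify the aggregation rule. As a minimal sketch only, the Python below assumes a FedAvg-style average weighted by each edge node's sample count; the function name `federated_average`, the dictionary layout of the weights, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Assumed layout: parameter name -> flat list of floats (hypothetical, for illustration).
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average edge model parameters, weighting each edge by its sample count."""
    if not updates:
        raise ValueError("need at least one edge update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shape as the first edge's parameters.
    first_weights, _ = updates[0]
    merged = {name: [0.0] * len(values) for name, values in first_weights.items()}

    for weights, n in updates:
        share = n / total  # this edge's contribution to the global model
        for name, values in weights.items():
            if name not in merged or len(values) != len(merged[name]):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            for i, v in enumerate(values):
                merged[name][i] += share * v
    return merged


# Toy example: two edge nodes holding 100 and 300 local samples.
edge_a = ({"w": [1.0, 2.0], "b": [0.0]}, 100)
edge_b = ({"w": [3.0, 4.0], "b": [1.0]}, 300)
global_weights = federated_average([edge_a, edge_b])
```

In this toy run the second edge holds three quarters of the samples, so the global parameters sit three quarters of the way toward its values. A real deployment would also need the secure transport and authentication steps that the abstract attributes to CL-PKE and BLS, which this sketch omits.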