Safety at Scale: A Comprehensive Survey of Large Model Safety

Journal: arXiv

Published Date: Feb 2, 2025

Abstract

The rapid advancement of large models, driven by their exceptional abilities in learning and generalization through large-scale pre-training, has reshaped the landscape of Artificial Intelligence (AI). These models are now foundational to a wide range of applications, including conversational AI, recommendation systems, autonomous driving, content generation, medical diagnostics, and scientific discovery. However, their widespread deployment also exposes them to significant safety risks, raising concerns about robustness, reliability, and ethical implications. This survey provides a systematic review of current safety research on large models, covering Vision Foundation Models (VFMs), Large Language Models (LLMs), Vision-Language Pre-training (VLP) models, Vision-Language Models (VLMs), Diffusion Models (DMs), and large-model-based Agents. Our contributions are summarized as follows: (1) We present a comprehensive taxonomy of safety threats to these models, including adversarial attacks, data poisoning, backdoor attacks, jailbreak and prompt injection attacks, energy-latency attacks, data and model extraction attacks, and emerging agent-specific threats. (2) We review the defense strategies proposed for each type of attack, where available, and summarize the commonly used datasets and benchmarks for safety research. (3) Building on this, we identify and discuss the open challenges in large model safety, emphasizing the need for comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices. More importantly, we highlight the necessity of collective efforts from the research community and international collaboration. Our work can serve as a useful reference for researchers and practitioners, fostering the ongoing development of comprehensive defense systems and platforms to safeguard AI models.

Authors

Xingjun Ma
Yifeng Gao
Yixu Wang
Ruofan Wang
Xin Wang
Ye Sun
Yifan Ding
Hengyuan Xu
Yunhao Chen
Yunhan Zhao
Hanxun Huang
Yige Li
Jiaming Zhang
Xiang Zheng
Yang Bai
Zuxuan Wu
Xipeng Qiu
Jingfeng Zhang
Yiming Li
Xudong Han
Haonan Li
Jun Sun
Cong Wang
Jindong Gu
Baoyuan Wu
Siheng Chen
Tianwei Zhang
Yang Liu
Mingming Gong
Tongliang Liu
Shirui Pan
Cihang Xie
Tianyu Pang
Yinpeng Dong
Ruoxi Jia
Yang Zhang
Shiqing Ma
Xiangyu Zhang
Neil Gong
Chaowei Xiao
Sarah Erfani
Tim Baldwin
Bo Li
Masashi Sugiyama
Dacheng Tao
James Bailey
Yu-Gang Jiang

External Resources

View on arXiv (http://arxiv.org/abs/2502.05206v3)
