AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection
Journal: arXiv
Published Date: Apr 16, 2025
Abstract
Industrial Anomaly Detection (IAD) poses a formidable challenge due to the
scarcity of defective samples, making it imperative to deploy models capable of
robust generalization to detect unseen anomalies effectively. Traditional
approaches, often constrained by hand-crafted features or domain-specific
expert models, struggle to address this limitation, underscoring the need for a
paradigm shift. We introduce AnomalyR1, a framework built on VLM-R1, a
Multimodal Large Language Model (MLLM) noted for its generalization and
interpretability. By combining the MLLM with Group Relative Policy
Optimization (GRPO), enhanced by our novel Reasoned Outcome Alignment Metric
(ROAM), AnomalyR1 provides a fully end-to-end solution that takes an image and
domain knowledge as input, reasons through its analysis, and generates precise
anomaly localizations and masks. Based
on the latest multimodal IAD benchmark, our compact 3-billion-parameter model
outperforms existing methods, establishing state-of-the-art results. This study
is the first to deliver an end-to-end VLM-based IAD solution, demonstrating the
potential of ROAM-enhanced GRPO. As MLLM capabilities continue to advance, the
framework offers a foundation for next-generation intelligent anomaly detection
systems in industrial applications with limited defective data.
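
The abstract names GRPO without describing its mechanics. As a rough illustration only, the sketch below shows the group-relative advantage step GRPO is generally known for: several responses are sampled for the same input, each is scored, and each score is normalized against the group's mean and standard deviation. The reward values are hypothetical placeholders. ROAM is not defined in the abstract, so no attempt is made to reproduce it here, and the function name, the `eps` constant, and the choice of population standard deviation are assumptions rather than details from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples in its group.

    rewards: scalar scores for responses sampled from the same prompt.
    eps: small constant (an assumption) to avoid division by zero when
         every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses to one image-plus-prompt input.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Responses scoring above the group mean get a positive advantage and those below get a negative one, so the policy update needs no separately learned value model.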