Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Journal: arXiv
Published Date: Apr 9, 2025
Abstract
Although multimodal large language models (MLLMs) exhibit remarkable
reasoning capabilities on complex multimodal understanding tasks, they still
suffer from the notorious hallucination issue: generating outputs misaligned
with obvious visual or factual evidence. Currently, training-based solutions,
like direct preference optimization (DPO), leverage paired preference data to
suppress hallucinations. However, they risk sacrificing general reasoning
capabilities due to likelihood displacement. Meanwhile, training-free
solutions, like contrastive decoding, suppress hallucinations by subtracting a
hallucination pattern estimated from a distorted input. Yet, these handcrafted
perturbations (e.g., adding noise to images) may poorly capture authentic
hallucination patterns. To avoid these weaknesses of existing methods and
achieve robust hallucination mitigation (i.e., mitigation that maintains
general reasoning performance), we propose a novel framework: Decoupling
Contrastive Decoding
(DCD). Specifically, DCD decouples the learning of positive and negative
samples in preference datasets, and trains separate positive and negative image
projections within the MLLM. The negative projection implicitly models real
hallucination patterns, which enables vision-aware negative images in the
contrastive decoding inference stage. Our DCD alleviates likelihood
displacement by avoiding pairwise optimization and generalizes robustly without
handcrafted degradation. Extensive ablations across hallucination benchmarks
and general reasoning tasks demonstrate the effectiveness of DCD: it matches
DPO's hallucination suppression while preserving general capabilities, and it
outperforms handcrafted contrastive decoding methods.
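
To make the contrastive decoding step concrete, the following is a minimal sketch of how logits from a positive branch and a negative branch might be combined at one decoding step. The abstract does not state DCD's exact combination rule; the form `(1 + alpha) * pos - alpha * neg` is a common formulation in prior contrastive decoding work and is assumed here for illustration. The names `contrastive_logits`, `greedy_token`, and `alpha`, as well as the toy logit values, are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_logits(pos_logits, neg_logits, alpha=1.0):
    """Combine per-token logits from the two branches.

    Assumed rule (not taken from the abstract):
        score = (1 + alpha) * pos - alpha * neg
    Tokens that the negative (hallucination-prone) branch favors are
    pushed down; alpha = 0 recovers ordinary decoding.
    """
    if len(pos_logits) != len(neg_logits):
        raise ValueError("both branches must score the same vocabulary")
    return [(1.0 + alpha) * p - alpha * n
            for p, n in zip(pos_logits, neg_logits)]


def greedy_token(pos_logits, neg_logits, alpha=1.0):
    """Return the index of the highest-scoring token after contrast."""
    scores = contrastive_logits(pos_logits, neg_logits, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy three-token vocabulary. In a DCD-style setup, `pos` would come from
# the forward pass using the positive image projection and `neg` from the
# pass using the negative image projection (hypothetical values here).
pos = [2.0, 1.9, 0.1]
neg = [2.5, 0.5, 0.1]

plain_choice = greedy_token(pos, neg, alpha=0.0)     # ordinary decoding
contrast_choice = greedy_token(pos, neg, alpha=1.0)  # contrastive decoding
contrast_probs = softmax(contrastive_logits(pos, neg, alpha=1.0))
```

In this toy case, token 0 narrowly wins under ordinary decoding, but because the negative branch scores it even higher, the contrast shifts the choice to token 1. In handcrafted contrastive decoding the negative logits would come from a perturbed image, whereas the abstract describes obtaining them from a learned negative projection.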