SHIELD: Suppressing Hallucinations In LVLM Encoders via Bias and Vulnerability Defense
Abstract
This paper traces object hallucinations in large vision-language models (LVLMs) to their visual encoders and identifies three underlying issues: statistical bias, inherent bias, and vulnerability. To address them, it introduces SHIELD, a training-free framework with three strategies: re-weighting visual tokens to reduce statistical bias, injecting noise-derived tokens to counteract inherent bias, and applying adversarial attacks with contrastive decoding to mitigate vulnerability. Experiments across multiple benchmarks and LVLM families show that SHIELD reduces object hallucinations while maintaining strong general performance. The code is publicly available.
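The contrastive decoding step mentioned above can be illustrated with a minimal sketch. It follows the generic contrastive decoding recipe (boost tokens favored by the clean branch, penalize tokens favored by a perturbed branch, and restrict the choice to tokens the clean branch already finds plausible); it is not SHIELD's exact formulation. The function names, the `alpha` and `beta` parameters, and the assumption that the perturbed logits come from a forward pass on an adversarially attacked image are all illustrative choices, not taken from the paper or its repository.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_decode_step(clean_logits, perturbed_logits, alpha=1.0, beta=0.1):
    """Return the next-token id chosen by a generic contrastive decoding rule.

    clean_logits:     next-token logits from the original image (assumed input)
    perturbed_logits: next-token logits from a perturbed image (assumed input)
    alpha:            contrast strength (hypothetical parameter; 0 = greedy)
    beta:             plausibility threshold relative to the top clean token
                      (hypothetical parameter, must be in (0, 1])
    """
    clean_lp = log_softmax(clean_logits)
    pert_lp = log_softmax(perturbed_logits)

    # Keep only tokens whose clean probability is at least beta times the
    # probability of the most likely clean token.
    cutoff = max(clean_lp) + math.log(beta)

    best_id, best_score = None, -math.inf
    for token_id, (c, p) in enumerate(zip(clean_lp, pert_lp)):
        if c < cutoff:
            continue
        # Reward agreement with the clean branch, penalize the perturbed one.
        score = (1 + alpha) * c - alpha * p
        if score > best_score:
            best_id, best_score = token_id, score
    return best_id
```

With `alpha=0` the rule reduces to greedy decoding on the clean branch. The plausibility cutoff prevents a token that is very unlikely under the clean image from winning only because the perturbed branch rates it even lower.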
Key Points
First work to trace object hallucinations in LVLMs to their visual encoders, identifying statistical bias, inherent bias, and vulnerability as root causes.
Proposes SHIELD, a training-free framework with three complementary strategies: visual token re-weighting, noise-derived token injection, and adversarial attacks combined with contrastive decoding.
SHIELD significantly reduces object hallucinations across diverse benchmarks and multiple LVLM families without fine-tuning the vision encoder or the LLM.
The method also maintains strong performance on general LVLM benchmarks, suggesting broad applicability beyond hallucination mitigation.
Code is publicly available at https://github.com/hukcc/SHIELD.