Robust Multimodal Large Language Models Against Modality Conflict

Robust Multimodal Large Language Models Against Modality Conflict [94.1]
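Multimodal large language models (MLLMs) are prone to hallucination in real-world scenarios. We study the inherent conflicts between inputs from different modalities, which place MLLMs in a dilemma and lead directly to hallucination. Three methods are proposed to mitigate hallucination caused by modality conflict.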
Reference translation of the paper (metadata) (Wed, 09 Jul 2025 11:18:38 GMT)
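The paper organizes countermeasures against hallucination specific to MLLMs, namely hallucination arising from inconsistency between modalities. The authors also built a dataset called "Multimodal Modality Conflict (MMMC)" and used it for evaluation. In the evaluation they tried prompt engineering, SFT, and reinforcement learning to reduce hallucination, and report: "Our results show that the reinforcement learning method achieves the best performance in mitigating the hallucination under modality conflict, while the supervised fine-tuning method shows promising and stable performance."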
The repository is on GitHub: zmzhang2000/MMMC (Official repository for Robust Multimodal Large Language Models Against Modality Conflict)
