Hallucination – Introducing the Latest arXiv Papers

(Im)possibility of Automated Hallucination Detection in Large Language Models

(Im)possibility of Automated Hallucination Detection in Large Language Models [40.1]
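The paper proposes a theoretical framework for analyzing whether hallucinations generated by large language models (LLMs) can be detected automatically. It asks whether an algorithm trained on examples drawn from an unknown target language can reliably determine whether an LLM's output is correct or constitutes a hallucination. The authors show that using expert-labeled feedback, i.e., training the detector on both positive examples (correct statements) and negative examples (explicitly labeled incorrect statements), dramatically changes this conclusion.
Reference translation of the paper (metadata) (Wed, 23 Apr 2025 18:00:07 GMT)
A report on hallucination, stating that "Automated detection of hallucinations by a detector that is trained only on correct examples (positive examples) is inherently difficult and typically impossible without additional assumptions or signals." and that "Reliable automated hallucination detection is achievable when the detector is trained using both correct (positive) and explicitly labeled incorrect (negative) examples."
As the paper itself points out, "These findings underscore the critical role of human feedback in practical LLM training.", which is consistent with how LLMs are currently built (although it remains to be seen whether the feedback really has to come from humans...).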
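The intuition behind the two quoted findings can be illustrated with a toy example. This is only a sketch of the intuition, not the paper's formal framework: the two candidate "languages", the `CANDIDATES` table, and the `consistent` helper are all hypothetical. With positive examples alone, both candidates remain possible, so a detector cannot tell whether the string "c" is a hallucination; a single labeled negative example removes the ambiguity.

```python
from typing import Dict, Iterable, List, Set

# Two hypothetical candidate target languages (sets of "correct" statements).
CANDIDATES: Dict[str, Set[str]] = {
    "small": {"a", "b"},
    "large": {"a", "b", "c"},
}


def consistent(positives: Iterable[str], negatives: Iterable[str]) -> List[str]:
    """Return the names of candidate languages that agree with all labeled examples."""
    pos, neg = set(positives), set(negatives)
    return [
        name
        for name, language in CANDIDATES.items()
        # Every positive must be in the language and no negative may be in it.
        if pos <= language and not (neg & language)
    ]


# Positive examples only: both candidates survive, so the status of "c" is undecidable.
only_positive = consistent(positives=["a", "b"], negatives=[])

# Adding one explicitly labeled incorrect example pins down the target language.
with_negative = consistent(positives=["a", "b"], negatives=["c"])

print(only_positive, with_negative)
```

In this toy setting the labeled negative plays the role that expert feedback plays in the quoted result.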

The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination

The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination [85.2]
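This paper proposes a novel framework that quantifies factual hallucinations by modeling knowledge overshadowing. The proposed method markedly improves model factuality on Overshadow (27.9%), MemoTrap (13.1%), and NQ-Swap (18.3%).
Reference translation of the paper (metadata) (Sat, 22 Feb 2025 08:36:06 GMT)
Proposes a way to quantify hallucination together with a decoding strategy for suppressing it, "Contrastive Decoding to Amplify Overshadowed Knowledge (CoDA)".
The claim that "Our work identify knowledge overshadowing as a contributional cause of LLMs hallucination, where dominant knowledge suppresses less frequent facts, leading to fact distortions." matches both intuition and experience, and the experimental results are interesting as well.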
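As a rough illustration of how contrastive decoding can bring a suppressed answer back to the top, the sketch below uses the generic contrastive score log p_full - alpha * log p_contrast. This is not CoDA's exact formulation, which is not given here: the probabilities are made-up numbers, `p_contrast` is assumed to come from a prompt with the distinguishing detail removed, and `alpha` is a hypothetical weight.

```python
import math
from typing import Dict


def contrastive_scores(
    p_full: Dict[str, float], p_contrast: Dict[str, float], alpha: float = 1.0
) -> Dict[str, float]:
    """Score each token by log p_full(token) - alpha * log p_contrast(token)."""
    return {
        token: math.log(p_full[token]) - alpha * math.log(p_contrast[token])
        for token in p_full
    }


# Toy next-token distributions (made-up numbers).
# p_full: conditioned on the full prompt, where the dominant answer still wins.
p_full = {"common_answer": 0.6, "rare_answer": 0.4}
# p_contrast: conditioned on the prompt without the distinguishing detail (assumed).
p_contrast = {"common_answer": 0.9, "rare_answer": 0.1}

# Plain greedy decoding picks the dominant answer.
greedy_choice = max(p_full, key=p_full.get)

# Contrastive scoring penalizes tokens that are likely even without the detail.
scores = contrastive_scores(p_full, p_contrast)
contrastive_choice = max(scores, key=scores.get)

print(greedy_choice, contrastive_choice)
```

The design choice is that a token which is probable regardless of the distinguishing detail is down-weighted, so the less frequent fact can surface.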

Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking

Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking [124.7]
HaluSearch is a new framework that incorporates tree search-based algorithms. It frames text generation as a step-by-step reasoning process, and it introduces a hierarchical thinking-system switch mechanism inspired by dual process theory in cognitive science.
Reference translation of the paper (metadata) (Thu, 02 Jan 2025 15:36:50 GMT)
"We propose HaluSearch, which integrates tree search-based algorithms (e.g., MCTS) to explicitly implement a slow thinking process during the inference stage of LLMs, fully exploiting their own internal knowledge to mitigate hallucinations in generated text." The approach evaluates a reward at each step: "To facilitate self-evaluation, we trained the reward model using data synthesized by the HaluSearch framework to assess the degree of hallucinations and provide reward signals." A distinctive feature is the mechanism described as "Additionally, to improve efficiency, we introduced a dynamic system switch mechanism, which utilizes a trained switch model to enable LLMs to adaptively alternate between fast and slow thinking modes at both the instance and step levels.", which also looks promising as a countermeasure against overthinking.
An interesting approach that packs in just about everything available at this point.
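The step-level switching between fast and slow thinking described in the quotes can be sketched as follows. This is a simplified illustration under stated assumptions, not the HaluSearch implementation: slow mode uses plain best-of-N selection by reward instead of MCTS, and every callable (`fast_step`, `propose_candidates`, `reward`, `needs_slow_thinking`) is a hypothetical stand-in for a trained model.

```python
from typing import Callable, List


def generate_with_switch(
    steps: int,
    fast_step: Callable[[List[str]], str],
    propose_candidates: Callable[[List[str]], List[str]],
    reward: Callable[[List[str], str], float],
    needs_slow_thinking: Callable[[List[str]], bool],
) -> List[str]:
    """Build a response step by step, choosing fast or slow mode at each step."""
    response: List[str] = []
    for _ in range(steps):
        if needs_slow_thinking(response):
            # Slow mode: score several candidate steps and keep the best-rewarded one.
            candidates = propose_candidates(response)
            step = max(candidates, key=lambda cand: reward(response, cand))
        else:
            # Fast mode: take a single step directly, with no search.
            step = fast_step(response)
        response.append(step)
    return response


# Toy stand-ins (all hypothetical): slow thinking is triggered only for the second step.
result = generate_with_switch(
    steps=3,
    fast_step=lambda prefix: f"fast-{len(prefix)}",
    propose_candidates=lambda prefix: ["guess", "checked fact", "rumor"],
    reward=lambda prefix, cand: 1.0 if cand == "checked fact" else 0.0,
    needs_slow_thinking=lambda prefix: len(prefix) == 1,
)
print(result)
```

Only the steps flagged by the switch pay the cost of scoring multiple candidates, which is the efficiency argument made in the third quote.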

Combating Multimodal LLM Hallucination via Bottom-up Holistic Reasoning

Combating Multimodal LLM Hallucination via Bottom-up Holistic Reasoning [151.4]
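Multimodal large language models (MLLMs) have shown unprecedented capabilities in advancing vision-language tasks. This paper proposes a bottom-up reasoning framework to address hallucination in MLLMs. The framework systematically addresses potential issues in both visual and textual inputs by verifying and integrating perception-level information with cognition-level commonsense knowledge.
Reference translation of the paper (metadata) (Sun, 15 Dec 2024 09:10:46 GMT)
A hallucination countermeasure targeting MLLMs and VQA tasks, composed of six modules: 1. Target Identification and Visual Perception, 2. Visual Perception Verification, 3. Question Validation and Adjustment, 4. Commonsense Induction, 5. Commonsense Verification, 6. Question Answering.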

DecoPrompt

DecoPrompt : Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises [28.7]
The paper proposes DecoPrompt, a new prompting algorithm for mitigating hallucination. DecoPrompt uses LLMs to "decode" false-premise prompts. Experiments on two datasets show that DecoPrompt effectively reduces hallucinations in the outputs of different LLMs.
Reference translation of the paper (metadata) (Tue, 12 Nov 2024 00:48:01 GMT)
Starting from the motivation "Inspired by the observation that entropy of the false-premise prompt is closely related to its likelihood to elicit hallucination generation, we propose a new prompting algorithm, named DecoPrompt, to mitigate hallucination.", the paper proposes a method that "1) first paraphrases the user’s prompt to obtain several semantically similar candidates, then 2) decodes them with the LLM, and 3) selects the lowest-entropy candidate as the new prompt." The method looks simple, but it is interesting that it is effective.
The repository is GitHub – xunannancy/DecoPrompt: Code for paper DecoPrompt : Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises
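The selection step of the quoted procedure can be sketched as below. This is a minimal illustration, not the code from the repository: the paraphrase candidates are written by hand, `token_distributions` is a hypothetical stand-in for the per-token probability distributions an LLM would produce when decoding each candidate, and averaging Shannon entropy over token positions is an assumption about how the entropy score is aggregated.

```python
import math
from typing import Dict, List, Sequence


def mean_token_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) over per-token probability distributions."""
    if not distributions:
        raise ValueError("need at least one token distribution")
    total = 0.0
    for dist in distributions:
        # Zero-probability entries contribute nothing to the entropy.
        total += -sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(distributions)


def select_lowest_entropy_prompt(candidates: Dict[str, List[List[float]]]) -> str:
    """Return the candidate prompt whose decoded distributions have the lowest mean entropy."""
    return min(candidates, key=lambda prompt: mean_token_entropy(candidates[prompt]))


# Hypothetical paraphrases of one user prompt, with toy per-token distributions.
token_distributions = {
    "Why did X win the award in 2020?": [[0.4, 0.3, 0.3], [0.5, 0.5]],
    "Did X win the award in 2020, and if so, why?": [[0.9, 0.05, 0.05], [0.8, 0.2]],
}

best_prompt = select_lowest_entropy_prompt(token_distributions)
print(best_prompt)
```

In a real pipeline the distributions would come from the model's output probabilities for each paraphrase, and the selected prompt would replace the user's original one.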

Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector

Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector [114.9]
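Hallucination detection is a challenging task for large language models (LLMs). This paper proposes an autonomous LLM-based agent framework called HaluAgent. HaluAgent integrates an LLM with a multi-functional toolbox and designs a fine-grained three-stage detection framework.
Reference translation of the paper (metadata) (Mon, 17 Jun 2024 07:30:05 GMT)
Proposes a hallucination detection agent with strong performance, built by fine-tuning small LLMs (7B and 13B). The approach switches among multiple tools (such as a search engine and a code execution environment), and the fine-tuning data is obtained from GPT-4.
(Using GPT-4 raises licensing issues, but) LLMs such as Nemotron that can be used this way without problems have appeared, so the method looks promising.
The repository is GitHub – RUCAIBox/HaluAgent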

Hallucination of Multimodal Large Language Models: A Survey

Hallucination of Multimodal Large Language Models: A Survey [40.7]
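Multimodal large language models (MLLMs) have shown significant progress and remarkable capabilities on multimodal tasks. Despite these promising developments, MLLMs often generate outputs that are inconsistent with the visual content. This survey aims to deepen the understanding of hallucination in MLLMs and to encourage further progress in the field.
Reference translation of the paper (metadata) (Mon, 29 Apr 2024 17:59:41 GMT)
A survey of hallucination in multimodal LLMs, useful for getting an overview of the latest developments.
There is also a paper repository: GitHub – showlab/Awesome-MLLM-Hallucination: 📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).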

SAFE: Search-Augmented Factuality Evaluator

Long-form factuality in large language models [59.3]
Large language models (LLMs) often generate content containing factual errors when responding to fact-seeking prompts on open-ended topics. The authors first use GPT-4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. They then propose that LLM agents can be used as automated evaluators of long-form factuality through a method they call the Search-Augmented Factuality Evaluator (SAFE).
Reference translation of the paper (metadata) (Wed, 27 Mar 2024 17:48:55 GMT)
A benchmark focused on factual errors. According to the paper, "SAFE utilizes an LLM to break down a long-form response into a set of individual facts and to evaluate the accuracy of each fact using a multi-step reasoning process comprising sending search queries to Google Search and determining whether a fact is supported by the search results." and "Empirically, we demonstrated that SAFE achieves superhuman performance by agreeing with 72% of human annotations and winning 76% of examples out of a set of 100 randomly-sampled disagreement cases." The benchmark results rank GPT-4-turbo > Gemini Ultra > Claude 3 Opus, which seems to support the impression that Claude 3 Opus may hallucinate a lot. SAFE looks useful not only for evaluation but also as a secondary check.
The repository is google-deepmind/long-form-factuality: Benchmarking long-form factuality in large language models. Original code for our paper “Long-form factuality in large language models.” (github.com)
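The flow described in the first quote (split a response into individual facts, look each one up, decide whether it is supported) can be sketched as follows. This is a heavily simplified illustration, not the code from the repository: `split_into_facts` is a naive sentence splitter standing in for the LLM step, `is_supported` uses substring matching instead of multi-step LLM reasoning, and `fake_search` is a hypothetical fixed corpus instead of a real search API.

```python
from typing import Callable, Dict, List


def split_into_facts(response: str) -> List[str]:
    """Naive stand-in for the LLM step: treat each sentence as one fact."""
    return [s.strip() for s in response.split(".") if s.strip()]


def is_supported(fact: str, snippets: List[str]) -> bool:
    """Naive stand-in for LLM reasoning: supported if some snippet contains the fact."""
    return any(fact.lower() in snippet.lower() for snippet in snippets)


def evaluate_response(response: str, search: Callable[[str], List[str]]) -> Dict[str, bool]:
    """Label every extracted fact as supported (True) or not supported (False)."""
    return {fact: is_supported(fact, search(fact)) for fact in split_into_facts(response)}


def fake_search(query: str) -> List[str]:
    """Hypothetical search backend over a fixed corpus; a real system would call a search API."""
    corpus = [
        "The Eiffel Tower is in Paris, France.",
        "Mount Fuji is the highest mountain in Japan.",
    ]
    words = query.lower().split()
    return [doc for doc in corpus if any(word in doc.lower() for word in words)]


labels = evaluate_response(
    "The Eiffel Tower is in Paris. Mount Fuji is in Brazil.", fake_search
)
supported = sum(labels.values())
print(labels, f"{supported}/{len(labels)} facts supported")
```

Because each fact is judged independently, the same loop could also serve as the secondary check on generated text mentioned above.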

Fine-grained Hallucination Detection and Editing for Language Models

Fine-grained Hallucination Detection and Editing for Language Models [114.3]
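Large language models (LMs) are prone to generating a wide variety of factually incorrect statements, which are called hallucinations. Current approaches mainly focus on coarse-grained automatic hallucination detection or editing, overlooking nuanced error levels. This work proposes a taxonomy covering six hierarchically defined types of hallucination.
Reference translation of the paper (metadata) (Fri, 12 Jan 2024 19:02:48 GMT)
The paper divides hallucination into six categories, builds a benchmark, and proposes FAVA (FAct Verification with Augmentation) as a detection method. It reportedly outperforms baselines such as "ChatGPT (gpt-3.5-turbo-0301) with a carefully designed prompt describing all six categories with two demonstrations." and that same baseline combined with Contriever.
The project site is Fine-grained Hallucination Detection and Editing For Language Models (fine-grained-hallucination.github.io)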

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models [7.7]
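Large language models (LLMs) continue to advance in their ability to write human-like text. A key challenge is their tendency to hallucinate, producing content that appears factual but is ungrounded. This paper surveys over 32 techniques developed to mitigate hallucination in LLMs.
Reference translation of the paper (metadata) (Tue, 2 Jan 2024 17:56:30 GMT)
A survey of hallucination mitigation techniques.
Many techniques have appeared, but some are usable in a real implementation and some are not, and their effectiveness varies. Some depend heavily on the language, so my impression is that there is no definitive solution yet.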
