RAG – Page 3 – Introducing the Latest arXiv Papers

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning [45.7]
This paper introduces Search-R1, an extension of the DeepSeek-R1 model for large language models (LLMs). Search-R1 autonomously generates (multiple) search queries during step-by-step reasoning with real-time retrieval. In experiments, Search-R1 improved performance over SOTA baselines by 26% (Qwen2.5-7B), 21% (Qwen2.5-3B), and 10% (LLaMA3.2-3B).
Reference translation of the paper (metadata) (Wed, 12 Mar 2025 16:26:39 GMT)
Proposes a framework that advances its reasoning while issuing search queries: "SEARCH-R1, a novel reinforcement learning framework that enables large language models (LLMs) to interleave self-reasoning with real-time search engine interactions."
The repository is GitHub – PeterGriffinJin/Search-R1: Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
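
To make the idea of interleaving reasoning with search calls concrete, here is a minimal sketch of the inference-time loop only. It does not show reinforcement learning training and is not the authors' implementation. The action format, `interleaved_rollout`, `toy_policy`, `toy_search`, and `TOY_INDEX` are all hypothetical names introduced for illustration; in practice the policy would be an LLM and the search function a real retriever.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical action format: ("search", query) or ("answer", text).
Action = Tuple[str, str]


def interleaved_rollout(
    question: str,
    policy: Callable[[str, List[str]], Action],
    search: Callable[[str], str],
    max_turns: int = 4,
) -> Tuple[str, List[str]]:
    """Alternate policy steps and search calls until an answer is produced."""
    context: List[str] = []  # retrieved passages accumulated so far
    for _ in range(max_turns):
        kind, payload = policy(question, context)
        if kind == "answer":
            return payload, context
        # Otherwise the policy asked for a search; feed the result back in.
        context.append(search(payload))
    return "", context  # turn budget exhausted without an answer


# Toy stand-ins so the sketch runs without a model or a search engine.
TOY_INDEX: Dict[str, str] = {
    "capital of France": "Paris is the capital of France.",
    "population of Paris": "Paris has about 2.1 million residents.",
}


def toy_search(query: str) -> str:
    return TOY_INDEX.get(query, "no result")


def toy_policy(question: str, context: List[str]) -> Action:
    # Issue one query per turn, then answer once both passages are in context.
    plan = ["capital of France", "population of Paris"]
    if len(context) < len(plan):
        return ("search", plan[len(context)])
    return ("answer", "about 2.1 million")


answer, evidence = interleaved_rollout(
    "How many people live in the capital of France?", toy_policy, toy_search
)
```

The point of the sketch is the control flow: the second query is only issued after the first result has been added to the context, which is what distinguishes this from a single retrieval step before generation.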

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking [96.9]
We introduce SolutionBench, a new benchmark for evaluating a system's ability to generate complete and feasible solutions to engineering problems. We propose SolutionRAG, a new system that uses tree-based exploration and a bi-point thinking mechanism to generate reliable solutions.
Reference translation of the paper (metadata) (Fri, 28 Feb 2025 05:23:10 GMT)
Proposes SolutionBench, a benchmark for generating solutions to engineering problems, and SolutionRAG, a method for solving it. Although the name says RAG, it is an agentic approach that explores while building a tree: "SolutionRAG employs a bi-point thinking approach, alternating between solution design and review, gradually enhancing the solution’s completeness and reliability."
The repository is GitHub – Li-Z-Q/DeepSolution: DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models

Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models [68.9]
Retrieval-Augmented Generation (RAG) has proven effective at mitigating hallucinations in Large Language Models (LLMs). Existing automatic evaluation metrics cannot accurately assess the outputs generated by RAG models during training and evaluation. This paper proposes Judge-Consistency (ConsJudge), a method that enhances LLMs to achieve more accurate evaluation of RAG models.
Reference translation of the paper (metadata) (Wed, 26 Feb 2025 04:50:43 GMT)
Proposes an evaluation method targeting RAG: "Judge-Consistency (ConsJudge), a method that enhances LLM-based judgment models to generate more accurate evaluations for RAG models in a self-improvement framework."
The repository is GitHub – OpenBMB/ConsJudge

HippoRAG2, RAG vs Graph RAG, A-MEM: Agentic Memory for LLM Agents

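Research on RAG and memory augmentation is active: an improvement on HippoRAG (see xRAG, FlashRAG, HippoRAG – Introducing the Latest arXiv Papers), comparisons of RAG with GraphRAG, and agentic approaches. Because the methods excel in different areas, many efforts hybridize them, and my impression is that many approaches are also adapting to agentic designs.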

From RAG to Memory: Non-Parametric Continual Learning for Large Language Models [6.4]
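Retrieval-augmented generation (RAG) has become the dominant way to introduce new information. Recent RAG approaches augment vector embeddings with various structures such as knowledge graphs to address some gaps, namely sense-making and associativity. We propose HippoRAG 2, a framework that comprehensively outperforms standard RAG on factual, sense-making, and associative memory tasks.
Reference translation of the paper (metadata) (Thu, 20 Feb 2025 18:26:02 GMT)
A hybrid approach combining RAG and GraphRAG.
The repository is GitHub – OSU-NLP-Group/HippoRAG: [NeurIPS’24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personalized PageRank.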
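
The repository description names Personalized PageRank as one ingredient. As a rough illustration of that algorithm alone (not of HippoRAG's actual pipeline), the sketch below runs power iteration on a small adjacency-list graph, with the restart distribution concentrated on a seed node. The graph, node names, and function name are hypothetical.

```python
from typing import Dict, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    seeds: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power iteration for PageRank with restarts biased toward `seeds`."""
    nodes = list(graph)
    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("seeds must carry positive weight")
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            neighbors = graph[node]
            if not neighbors:
                # Dangling node: return its mass to the restart distribution.
                for n in nodes:
                    nxt[n] += damping * rank[node] * restart[n]
                continue
            share = damping * rank[node] / len(neighbors)
            for nb in neighbors:
                nxt[nb] += share
        rank = nxt
    return rank


# Hypothetical toy graph: a query entity linked to passages via shared entities.
toy_graph = {
    "query_entity": ["doc_a", "doc_b"],
    "doc_a": ["query_entity", "doc_c"],
    "doc_b": ["query_entity"],
    "doc_c": ["doc_a", "doc_d"],
    "doc_d": ["doc_c"],
}
scores = personalized_pagerank(toy_graph, seeds={"query_entity": 1.0})
```

Nodes close to the seed end up with more mass than distant ones, which is why this kind of score can serve as a graph-aware relevance signal for retrieval.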

RAG vs. GraphRAG: A Systematic Evaluation and Key Insights [42.3]
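We systematically evaluate retrieval-augmented generation (RAG) and GraphRAG on text-based benchmarks. The results highlight the distinct strengths of RAG and GraphRAG across different tasks and evaluation perspectives.
Reference translation of the paper (metadata) (Mon, 17 Feb 2025 02:36:30 GMT)
A detailed comparison of standard RAG and GraphRAG.
According to the paper, "Community-based GraphRAG with Global Search focuses more on the global aspects of whole corpus, whereas RAG captures more detailed information."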

A-MEM: Agentic Memory for LLM Agents [42.5]
Large language model (LLM) agents need memory systems to leverage historical experiences. Current memory systems enable basic storage and retrieval but lack sophisticated memory organization. This paper proposes a new agentic memory system for LLM agents that can organize memories dynamically in an agentic way.
Reference translation of the paper (metadata) (Mon, 17 Feb 2025 18:36:14 GMT)
Agentic retention of data. The paper describes "(1) Link Generation – automatically establishing connections between memories by identifying shared attributes and similar contextual descriptions, and (2) Memory Evolution – enabling existing memories to dynamically evolve as new experiences are analyzed, leading to the emergence of higher-order patterns and attributes.", and the system is said to operate as follows.
- Generates comprehensive notes with structured attributes
- Creates contextual descriptions and tags
- Analyzes historical memories for relevant connections
- Establishes meaningful links based on similarities
- Enables dynamic memory evolution and updates
The repository is GitHub – WujiangXu/AgenticMemory
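
The steps listed above (structured notes, tags, linking by similarity, and evolving older memories) can be sketched in a few lines. This is a toy under loud assumptions: tag overlap (Jaccard similarity) stands in for whatever analysis the real system performs with an LLM, and `MemoryNote`, `ToyAgenticMemory`, and `link_threshold` are hypothetical names, not the AgenticMemory repository's API.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class MemoryNote:
    """A note with structured attributes (hypothetical schema)."""
    note_id: int
    content: str
    tags: Set[str]
    links: Set[int] = field(default_factory=set)


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class ToyAgenticMemory:
    def __init__(self, link_threshold: float = 0.3) -> None:
        self.notes: List[MemoryNote] = []
        self.link_threshold = link_threshold

    def add(self, content: str, tags: Set[str]) -> MemoryNote:
        note = MemoryNote(len(self.notes), content, set(tags))
        for old in self.notes:
            # Link generation: connect notes whose tags overlap enough.
            if jaccard(note.tags, old.tags) >= self.link_threshold:
                note.links.add(old.note_id)
                old.links.add(note.note_id)
                # Memory evolution: the older note absorbs the new tags.
                old.tags |= note.tags
        self.notes.append(note)
        return note


mem = ToyAgenticMemory()
first = mem.add("User prefers Python for scripting", {"python", "preference"})
second = mem.add("User asked about Python packaging", {"python", "packaging"})
third = mem.add("Meeting moved to Friday", {"calendar"})
```

After these three calls, the first two notes are linked to each other and the first note has picked up the `packaging` tag, while the unrelated third note stays isolated.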

Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey

Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.4]
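Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of AI-generated content (AIGC). RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks. Despite RAG's success and potential, recent studies show that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues.
Reference translation of the paper (metadata) (Sat, 08 Feb 2025 06:50:47 GMT)
A survey of RAG and trustworthiness. Granted that there are many practical considerations, I am somewhat surprised that a survey from this angle has become necessary.
The repository is GitHub – Arstanley/Awesome-Trustworthy-Retrieval-Augmented-Generation, where a paper list is published.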

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models [92.9]
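We propose DeepRAG, which models retrieval-augmented reasoning as a Markov decision process (MDP). By iteratively decomposing queries, DeepRAG dynamically decides at each step whether to retrieve external knowledge or rely on parametric reasoning. Experiments show that DeepRAG improves answer accuracy by 21.99%, demonstrating the effectiveness of optimizing retrieval-augmented reasoning.
Reference translation of the paper (metadata) (Mon, 03 Feb 2025 08:22:45 GMT)
A fairly elaborate RAG with "(1) Binary Tree Search, (2) Imitation Learning, and (3) Chain of Calibration." I would expect it to help accuracy, but...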
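
The per-step choice between retrieving and relying on parametric knowledge can be pictured as an action taken in each state of an episode. The sketch below is only an illustration of that decision structure: the subqueries are supplied up front rather than generated iteratively, and a fixed confidence threshold stands in for a learned policy. `run_episode`, `KNOWN`, `CORPUS`, and `threshold` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (subquery, action, intermediate answer)


def run_episode(
    subqueries: List[str],
    confidence: Callable[[str], float],
    parametric_answer: Callable[[str], str],
    retrieve_answer: Callable[[str], str],
    threshold: float = 0.5,
) -> List[Step]:
    """Choose 'parametric' or 'retrieve' for each subquery and record the step."""
    trajectory: List[Step] = []
    for sub in subqueries:
        if confidence(sub) >= threshold:
            # Confident enough: answer from the model's own knowledge.
            trajectory.append((sub, "parametric", parametric_answer(sub)))
        else:
            # Not confident: pay the cost of an external lookup.
            trajectory.append((sub, "retrieve", retrieve_answer(sub)))
    return trajectory


# Toy stand-ins for model knowledge and an external corpus.
KNOWN: Dict[str, str] = {"capital of France": "Paris"}
CORPUS: Dict[str, str] = {"height of the Eiffel Tower": "about 330 m"}

trajectory = run_episode(
    ["capital of France", "height of the Eiffel Tower"],
    confidence=lambda q: 1.0 if q in KNOWN else 0.0,
    parametric_answer=lambda q: KNOWN[q],
    retrieve_answer=lambda q: CORPUS.get(q, "not found"),
)
```

Here retrieval is triggered only for the second subquery, so the episode uses one lookup instead of two.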

Parametric Retrieval Augmented Generation

Parametric Retrieval Augmented Generation [32.3]
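Parametric RAG is a new RAG paradigm that integrates external knowledge directly into the parameters of feed-forward networks. This substantially improves both the effectiveness and the efficiency of knowledge augmentation in large language models.
Reference translation of the paper (metadata) (Mon, 27 Jan 2025 10:04:49 GMT)
Proposes a RAG(?) scheme: "we propose to insert documents directly into the parameters of L. To achieve this, the Parametric RAG framework is designed with two stages: an offline document parameterization stage and an online inference stage with a Retrieve-Update-Generate workflow." (L denotes the parameters of the LLM.) Even with LoRA the computation looks heavy, but the performance appears good.
The repository is GitHub – oneal2000/PRAG: Code for Parametric Retrieval Augmented Generation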

Long Context vs. RAG for LLMs: An Evaluation and Revisits

Long Context vs. RAG for LLMs: An Evaluation and Revisits [41.3]
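This paper revisits recent studies on this topic, highlighting their key insights and discrepancies. Long context (LC) outperformed RAG on question-answering benchmarks, especially for Wikipedia-based questions. We also provide an in-depth discussion that surveys the importance of context relevance in existing studies.
Reference translation of the paper (metadata) (Fri, 27 Dec 2024 14:34:37 GMT)
Close to Revisiting In-Context Learning with Long Context Language Models – Introducing the Latest arXiv Papers, but this is an examination of long context vs. RAG. "The results indicate that LC generally outperforms RAG for tasks involving well-structured, dense contexts—such as Wikipedia articles and books—and is better at answering questions requiring specific information. By contrast, RAG demonstrates advantages in handling fragmented information, particularly in dialogue-based scenarios and for more general questions." Each has its strengths and weaknesses.
The results make it hard to declare either approach sufficient on its own, but the broad evaluation is very informative.
The repository is GitHub – lixinze777/LC_VS_RAG: Offcial Page for Long Context vs. RAG for LLMs: An Evaluation and Revisits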

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Search-o1: Agentic Search-Enhanced Large Reasoning Models [24.2]
Large reasoning models (LRMs) such as OpenAI-o1 have demonstrated strong stepwise reasoning capabilities through large-scale reinforcement learning. We introduce Search-o1, a framework that enhances LRMs with an agentic retrieval-augmented generation (RAG) mechanism and a Reason-in-Documents module.
Reference translation of the paper (metadata) (Thu, 09 Jan 2025 16:48:17 GMT)
Proposes a RAG + Large Reasoning Model framework. It could be seen as an agentic approach, but with "(a) Direct reasoning without retrieval often results in inaccuracies due to missing knowledge. (b) Our agentic retrieval-augmented reasoning approach improves knowledge access but usually returns lengthy, redundant documents, disrupting coherent reasoning. (c) Our Search-o1 integrates concise and accurate retrieved knowledge seamlessly into the reasoning process, enabling precise and coherent problem-solving." the authors argue for the effectiveness of Reason-in-Documents, a process separate from the LRM that selects and summarizes information in line with the flow of reasoning and feeds it into the LRM.
The repository is Search-o1: Agentic Search-Enhanced Large Reasoning Models
