RAG – Page 5 – Introducing the Latest arXiv Papers

Trustworthiness in Retrieval-Augmented Generation Systems: A Survey

Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.3]
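Retrieval-Augmented Generation (RAG) has rapidly grown into an important paradigm in the development of large language models (LLMs). This paper proposes a unified framework that evaluates the trustworthiness of RAG systems along six dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
Reference translation of the paper (metadata) (Mon, 16 Sep 2024 09:06:44 GMT)
Surveys on trustworthy AI are common, but ones that target RAG specifically seem rare.
The repository is GitHub – smallporridge/TrustworthyRAG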

P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task

P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task [94.1]
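Embodied Everyday Task is a popular task in the embodied AI community. Natural language instructions often lack explicit task planning. Incorporating knowledge of the task environment into a model requires extensive training.
Reference translation of the paper (metadata) (Tue, 17 Sep 2024 15:29:34 GMT)
A proposal for using RAG to drive agent behavior (such as planning) given natural language instructions and environment information. The RAG database is updated dynamically, so it reads very much like an LLM-based agent in its own right.
Intuitively, retrieval looks like the hard part, yet the paper says "When an agent interacts with the environment during a task, it first receives the environment’s goal instruction 𝐼𝑔 and observation 𝑂𝑡. Then it encodes with MiniLM [31] both of them". It is surprising that such a simple approach works.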
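The retrieval step quoted above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a sentence encoder such as MiniLM, the `ProgressiveMemory` class and its stored plans are hypothetical, and concatenating the goal instruction with the observation before encoding is an assumption.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a sentence encoder such as MiniLM."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ProgressiveMemory:
    """Hypothetical store of past (goal, observation, plan) records that grows over time."""

    def __init__(self):
        self.records = []

    def add(self, goal, observation, plan):
        # Assumption: goal and observation are concatenated into one retrieval key.
        self.records.append((embed(goal + " " + observation), plan))

    def retrieve(self, goal, observation, k=1):
        query = embed(goal + " " + observation)
        ranked = sorted(self.records, key=lambda r: cosine(query, r[0]), reverse=True)
        return [plan for _, plan in ranked[:k]]


memory = ProgressiveMemory()
memory.add("put a clean mug on the table", "you see a sink and a mug",
           "take mug; clean mug in sink; put mug on table")
memory.add("heat an egg", "you see a microwave and a fridge",
           "open fridge; take egg; heat egg in microwave")

# A new task retrieves the most similar past plan to include in the planner's prompt.
best = memory.retrieve("put a clean cup on the table", "you see a sink and a cup")
```

Calling `memory.add(...)` after each completed episode is what would make the database "progressive": later tasks retrieve from a store that earlier interactions have filled in.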

Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking

Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking [6.9]
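We show that the ability to jailbreak GenAI models lets attackers escalate the outcome of attacks against RAG-based applications. The first part of the paper shows that attackers can escalate RAG membership inference attacks to RAG document extraction attacks. The second part shows that attackers can scale up RAG data poisoning attacks, going from the compromise of a single application to the compromise of the entire GenAI ecosystem.
Reference translation of the paper (metadata) (Thu, 12 Sep 2024 13:50:22 GMT)
Attacks on RAG: escalating from RAG membership inference attacks and RAG entity extraction attacks to RAG documents extraction attacks.
The idea of "Adversarial Self-Replicating Prompts" is interesting.
The repository is GitHub – StavC/UnleashingWorms-ExtractingData: Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking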

Data Gemma

DataGemma, announced by Google, is another interesting effort (DataGemma: AI open models connecting LLMs to Google’s Data Commons (blog.google), Grounding AI in reality with a little help from Data Commons (research.google)).

It aims to reduce hallucinations by using Data Commons (Home – Data Commons) and targets two use cases: RIG (Retrieval-Interleaved Generation) and RAG (Retrieval-Augmented Generation). The models are published at google/datagemma-rig-27b-it · Hugging Face and google/datagemma-rag-27b-it · Hugging Face.

For RIG, the description is: "The DataGemma model (based on the 27 billion parameter Gemma 2 model and fully fine-tuned for this RIG task) generates a response, which includes a natural language query for Data Commons’ existing natural language interface, specifically designed to retrieve relevant data. For example, instead of stating “The population of California is 39 million”, the model would produce “The population of California is [DC(What is the population of California?) → “39 million”]”, allowing for external verification and increased accuracy." For RAG, it is: "The DataGemma model (based on the Gemma 2 (27B) model and fully fine-tuned for this RAG task) analyzes the user’s query and generates a corresponding query (or queries) in natural language that can be understood by Data Commons’ existing natural language interface." In both cases the models are built to make good use of the existing Data Commons interface.
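To make the RIG output format concrete, here is a minimal sketch of how an annotation such as `[DC(...) → “...”]` could be post-processed. The regular expression is written only against the single example quoted above, `lookup` and `FAKE_DATA_COMMONS` are hypothetical stand-ins (not the Data Commons API), and the placeholder value is not real data.

```python
import re

# Matches annotations of the form [DC(<question>) → “<model value>”],
# following the one example quoted from the announcement.
RIG_PATTERN = re.compile(r'\[DC\((?P<query>.*?)\)\s*→\s*[“"](?P<guess>.*?)[”"]\]')

# Hypothetical stand-in for Data Commons' natural language interface.
FAKE_DATA_COMMONS = {
    "What is the population of California?": "<retrieved value>",
}


def lookup(query):
    """Return a retrieved statistic for the query, or None if nothing is found."""
    return FAKE_DATA_COMMONS.get(query)


def resolve_rig(text):
    """Replace each RIG annotation with the retrieved value, else keep the model's value."""
    def substitute(match):
        retrieved = lookup(match.group("query"))
        return retrieved if retrieved is not None else match.group("guess")
    return RIG_PATTERN.sub(substitute, text)


draft = ('The population of California is '
         '[DC(What is the population of California?) → “39 million”].')
resolved = resolve_rig(draft)
print(resolved)
```

Keeping the model's own value as a fallback is one possible design choice; a stricter pipeline might instead flag any annotation that could not be verified.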

Fine-tuning of this kind seems to be becoming increasingly important.

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation [54.7]
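Large language models (LLMs) demonstrate human-level capabilities in dialogue, reasoning, and knowledge retention. Current research addresses the bottlenecks that remain by incorporating external knowledge into LLMs. RAGLAB is a modular, research-oriented open-source library that reproduces six existing algorithms and provides a comprehensive ecosystem for investigating RAG algorithms.
Reference translation of the paper (metadata) (Wed, 21 Aug 2024 07:20:48 GMT)
A modular framework for RAG. The paper notes that "open-source tools such as LlamaIndex and LangChain employ high-level abstractions, which results in a lack of transparency and limits the ability to develop novel algorithms and evaluation metrics." In practical use as well, my impression is that these tools are often over-abstracted and hard to work with.
The repository is GitHub – fate-ubw/RAGLAB: RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation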

EfficientRAG: Efficient Retriever for Multi-Hop Question Answering

EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.6]
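We introduce EfficientRAG, an efficient retriever for multi-hop question answering. Experiments show that EfficientRAG outperforms existing RAG methods on three open-domain multi-hop question answering datasets.
Reference translation of the paper (metadata) (Thu, 08 Aug 2024 06:57:49 GMT)
A type of RAG that uses Labeler & Tagger and Filter models to reduce the number of LLM calls, with an approach that makes good use of synthetic data for training.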

RAG Foundry

RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation [8.4]
We introduce RAG Foundry, an open-source framework for augmenting large language models for RAG use cases. RAG Foundry integrates data creation, training, inference, and evaluation into a single workflow. We demonstrate the framework's effectiveness by augmenting and fine-tuning Llama-3 and Phi-3 models with diverse RAG configurations.
Reference translation of the paper (metadata) (Mon, 05 Aug 2024 15:16:24 GMT)
A framework described as "an open-source library dedicated to the task of RAG-augmentation of LLMs, namely fine-tuning LLMs to become better at RAG settings."
The repository is GitHub – IntelLabs/RAGFoundry: Framework for specializing LLMs for retrieval-augmented-generation tasks using fine-tuning.

RAGEval

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.5]
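Existing RAG benchmarks focus mainly on evaluating whether large language models can correctly answer general knowledge questions. This paper introduces RAGEval, a framework that automatically generates evaluation datasets. To carefully evaluate the responses produced by LLMs, it proposes three new metrics: completeness, hallucination, and irrelevance.
Reference translation of the paper (metadata) (Fri, 02 Aug 2024 13:35:11 GMT)
A framework for automatically generating benchmarks that evaluate RAG. The dataset is called the DRAGONBall dataset (Diverse RAG Omni-Benchmark for All domains), which is quite a name...
The per-model performance of the generators and retrievers that emerges from the analysis is interesting. The conclusion also points out that "Notably, while GPT-4o showed superior performance overall, the gap with top-performing open-source models was relatively small."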
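As a rough illustration of how three metrics like these could relate to one another, the sketch below scores an answer against a list of ground-truth key points. The key-point formulation, the label names, and the `keypoint_metrics` function are assumptions made for this example; the paper's exact definitions should be checked before relying on it.

```python
from collections import Counter


def keypoint_metrics(labels):
    """Compute three scores from per-key-point labels.

    labels: one entry per ground-truth key point, each one of
    'covered', 'contradicted', or 'missing' (hypothetical label names).
    """
    if not labels:
        raise ValueError("at least one key point is required")
    counts = Counter(labels)
    total = len(labels)
    completeness = counts["covered"] / total        # key points the answer covers
    hallucination = counts["contradicted"] / total  # key points the answer contradicts
    irrelevance = 1.0 - completeness - hallucination  # key points neither covered nor contradicted
    return {
        "completeness": completeness,
        "hallucination": hallucination,
        "irrelevance": irrelevance,
    }


scores = keypoint_metrics(["covered", "covered", "contradicted", "missing"])
print(scores)
```

In practice the labels would come from a judge (human or LLM) comparing the generated answer with each key point; here they are supplied by hand.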

Retrieval-Augmented Generation for Natural Language Processing: A Survey

Retrieval-Augmented Generation for Natural Language Processing: A Survey [25.1]
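Retrieval-augmented generation (RAG) uses external knowledge databases to augment large language models. This paper reviews the key techniques of RAG, in particular the retriever and retrieval fusion. RAG is used in natural language processing tasks and industrial scenarios.
Reference translation of the paper (metadata) (Thu, 18 Jul 2024 06:06:53 GMT)
A survey of RAG, which is important in practice.
Each component comes with many options, so having the information organized like this is very welcome.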

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems [124.8]
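We design a procedure to synthesize Haystacks of documents, ensuring that specific text repeats across documents. The "Summary of a Haystack" (SummHay) task then requires a system to process the Haystack and, given a query, generate a summary that identifies the relevant insights and precisely cites the source documents.
Reference translation of the paper (metadata) (Mon, 01 Jul 2024 15:23:42 GMT)
The paper builds the SummHay benchmark (from synthetic data) to test whether systems can summarize long and numerous documents, and compares a variety of LLMs and RAG systems. It states that "achieving strong coverage of key insights in a large corpus of text does not require retrieval, given a sufficiently capable long-context LLM." and that "for use-cases where citation quality is important, optimizing retrieval is paramount: it removes irrelevant documents from the summarizer’s context, narrowing and focusing options for citation.", so the usefulness of RAG appears to depend on the use case. It is also interesting that Gemini 1.5 Pro seems to work quite well even without RAG. Several retrieval strategies are compared as well, which makes a useful reference.
The repository is GitHub – salesforce/summary-of-a-haystack: Codebase accompanying the Summary of a Haystack paper.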
