RAG – Page 4 – Introducing the Latest arXiv Papers

Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking

Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking [6.9]
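We show that, with the ability to jailbreak a GenAI model, attackers can escalate the outcome of attacks against RAG-based applications. The first part of the paper shows that attackers can escalate RAG membership inference attacks to RAG documents extraction attacks. The second part shows that attackers can escalate the scale of RAG data poisoning attacks from compromising a single application to compromising the entire GenAI ecosystem.
Reference translation of the paper (metadata) (Thu, 12 Sep 2024 13:50:22 GMT)
Attacks against RAG: escalating from RAG membership inference attacks and RAG entity extraction attacks to RAG documents extraction attacks.
The idea of "Adversarial Self-Replicating Prompts" is interesting.
The repository is GitHub – StavC/UnleashingWorms-ExtractingData: Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking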

Data Gemma

DataGemma, announced by Google, is another interesting effort (DataGemma: AI open models connecting LLMs to Google’s Data Commons (blog.google), Grounding AI in reality with a little help from Data Commons (research.google)).

It aims to reduce hallucination by using Home – Data Commons, and targets two use cases: RIG (Retrieval-Interleaved Generation) and RAG (Retrieval-Augmented Generation). The models are published at google/datagemma-rig-27b-it · Hugging Face and google/datagemma-rag-27b-it · Hugging Face.

For RIG, the model is described as follows: "The DataGemma model (based on the 27 billion parameter Gemma 2 model and fully fine-tuned for this RIG task) generates a response, which includes a natural language query for Data Commons’ existing natural language interface, specifically designed to retrieve relevant data. For example, instead of stating “The population of California is 39 million”, the model would produce “The population of California is [DC(What is the population of California?) → “39 million”]”, allowing for external verification and increased accuracy." For RAG: "The DataGemma model (based on the Gemma 2 (27B) model and fully fine-tuned for this RAG task) analyzes the user’s query and generates a corresponding query (or queries) in natural language that can be understood by Data Commons’ existing natural language interface." In both cases the models are built to make good use of the existing Data Commons interface.
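
The RIG annotation in the quote above can be post-processed with a few lines of code. The following is a minimal sketch, assuming the output looks exactly like the illustrative example in the quote (`[DC(query) → "value"]`); the real model output may differ. The names `DC_PATTERN`, `extract_dc_calls`, `ground`, and the `lookup` callback are hypothetical, and `lookup` stands in for a call to Data Commons rather than reproducing its actual API.

```python
import re
from typing import Callable, List, Optional, Tuple

# Assumed annotation shape, taken only from the example in the quote:
#   [DC(<natural-language query>) → "<value the model produced>"]
# Straight and curly quotes, and "->" as an arrow, are accepted.
DC_PATTERN = re.compile(
    r'\[DC\((?P<query>.+?)\)\s*(?:→|->)\s*["“](?P<value>.*?)["”]\]'
)


def extract_dc_calls(text: str) -> List[Tuple[str, str]]:
    """Return (query, model_value) pairs for every annotation in text."""
    return [(m.group("query"), m.group("value")) for m in DC_PATTERN.finditer(text)]


def ground(text: str, lookup: Callable[[str], Optional[str]]) -> str:
    """Replace each annotation with the looked-up value.

    Falls back to the model's own value when lookup returns None.
    `lookup` is a hypothetical stand-in for a Data Commons query.
    """
    def _substitute(match: "re.Match[str]") -> str:
        verified = lookup(match.group("query"))
        return verified if verified is not None else match.group("value")

    return DC_PATTERN.sub(_substitute, text)


if __name__ == "__main__":
    sample = (
        "The population of California is "
        "[DC(What is the population of California?) → “39 million”]."
    )
    # Stub table used only for illustration; not real Data Commons output.
    stub = {"What is the population of California?": "39.0 million (stub)"}
    print(extract_dc_calls(sample))
    print(ground(sample, stub.get))
```

Keeping the model's own value alongside the query is what makes the external check possible: the two can be compared, or the retrieved figure can simply replace the generated one as in `ground`.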

This kind of fine-tuning seems to be becoming increasingly important.

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation [54.7]
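Large language models (LLMs) demonstrate human-level capabilities in dialogue, reasoning, and knowledge retention, yet they still struggle with hallucination and with keeping their knowledge up to date. Current research addresses this bottleneck by incorporating external knowledge into LLMs. RAGLAB is a modular, research-oriented open-source library that reproduces six existing algorithms and provides a comprehensive ecosystem for investigating RAG algorithms.
Reference translation of the paper (metadata) (Wed, 21 Aug 2024 07:20:48 GMT)
A modular framework for RAG. The paper states that "open-source tools such as LlamaIndex and LangChain employ high-level abstractions, which results in a lack of transparency and limits the ability to develop novel algorithms and evaluation metrics." My impression is that in practical use, too, these tools are often over-abstracted and hard to work with.
The repository is GitHub – fate-ubw/RAGLAB: RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation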

EfficientRAG: Efficient Retriever for Multi-Hop Question Answering

EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.6]
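We introduce EfficientRAG, an efficient retriever for multi-hop question answering. Experimental results show that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question answering datasets.
Reference translation of the paper (metadata) (Thu, 08 Aug 2024 06:57:49 GMT)
A RAG approach that uses Labeler & Tagger and Filter models to cut down on LLM calls, and that makes good use of synthetic data for training.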

RAG Foundry

RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation [8.4]
We introduce RAG Foundry, an open-source framework for augmenting large language models for RAG use cases. RAG Foundry integrates data creation, training, inference, and evaluation into a single workflow. We demonstrate the framework's effectiveness by augmenting and fine-tuning Llama-3 and Phi-3 models with diverse RAG configurations.
Reference translation of the paper (metadata) (Mon, 05 Aug 2024 15:16:24 GMT)
A framework intended as "an open-source library dedicated to the task of RAG-augmentation of LLMs, namely fine-tuning LLMs to become better at RAG settings."
The repository is GitHub – IntelLabs/RAGFoundry: Framework for specializing LLMs for retrieval-augmented-generation tasks using fine-tuning.

RAGEval

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.5]
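Existing RAG benchmarks mainly focus on evaluating whether large language models can correctly answer general knowledge questions. This paper introduces RAGEval, a framework for automatically generating evaluation datasets. To carefully evaluate the responses produced by LLMs, we propose three new metrics: Completeness, Hallucination, and Irrelevance.
Reference translation of the paper (metadata) (Fri, 02 Aug 2024 13:35:11 GMT)
A framework for automatically generating benchmarks that evaluate RAG. The DRAGONBall dataset (Diverse RAG Omni-Benchmark for All domains) is quite a name...
The analysis of how individual models perform on the generation side and on the retrieval side is interesting. The conclusion also notes: "Notably, while GPT-4o showed superior performance overall, the gap with top-performing open-source models was relatively small."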

Retrieval-Augmented Generation for Natural Language Processing: A Survey

Retrieval-Augmented Generation for Natural Language Processing: A Survey [25.1]
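Retrieval-augmented generation (RAG) uses an external knowledge database to augment large language models. This paper reviews the key techniques of RAG, especially the retriever and retrieval fusion. RAG is used in natural language processing tasks and industrial scenarios.
Reference translation of the paper (metadata) (Thu, 18 Jul 2024 06:06:53 GMT)
A survey of RAG, which matters a great deal in practice.
Each component comes with many options, so well-organized information like this is very welcome.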

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems [124.8]
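We design a procedure to synthesize Haystacks of documents, ensuring that specific insights repeat across documents. The "Summary of a Haystack" (SummHay) task then requires a system to process the Haystack and, given a query, generate a summary that identifies the relevant insights and precisely cites the source documents.
Reference translation of the paper (metadata) (Mon, 01 Jul 2024 15:23:42 GMT)
This paper builds SummHay, a benchmark (based on synthetic data) for whether systems can summarize long and numerous documents, and compares a variety of LLMs and RAG setups. It states that "achieving strong coverage of key insights in a large corpus of text does not require retrieval, given a sufficiently capable long-context LLM." and that "for use-cases where citation quality is important, optimizing retrieval is paramount: it removes irrelevant documents from the summarizer’s context, narrowing and focusing options for citation.", so the usefulness of RAG appears to depend on the use case. It is also interesting that Gemini 1.5 Pro seems to work quite well even without RAG. Several retrieval strategies are compared as well, which makes for a useful reference.
The repository is GitHub – salesforce/summary-of-a-haystack: Codebase accompanying the Summary of a Haystack paper.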

Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track

Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track [51.3]
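It is essential to have an arena in which to build, test, visualize, and systematically evaluate RAG-based search systems. We propose the TREC 2024 RAG Track.
Reference translation of the paper (metadata) (Mon, 24 Jun 2024 17:37:52 GMT)
A RAG evaluation benchmark and framework with quite a name.
The repository is GitHub – castorini/ragnarok: Retrieval-Augmented Generation battle!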

SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation [45.4]
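This paper introduces Self-aware Knowledge Retrieval (SeaKR). SeaKR is an adaptive RAG model that extracts the LLM's self-aware uncertainty from its internal states. Experiments on both complex and simple question answering datasets show that SeaKR outperforms existing adaptive RAG methods.
Reference translation of the paper (metadata) (Thu, 27 Jun 2024 14:38:33 GMT)
A RAG approach built on the strategy "SEAKR activates retrieval when the LLMs present high self-aware uncertainty for generation." With agentic, fairly complex behavior, it outperforms FLARE (Fugu-MT paper translation (abstract): Active Retrieval Augmented Generation (fugumt.com)) and DRAGIN (Fugu-MT paper translation (abstract): DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models (fugumt.com)).
The repository is GitHub – THU-KEG/SeaKR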
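
The "retrieve only when uncertain" strategy can be illustrated with a small sketch. This is not SeaKR's method: SeaKR reads uncertainty from the LLM's internal states, whereas the code below swaps in mean token entropy over output distributions as a simple proxy. The function names, the `generate` and `retrieve` callbacks, and the threshold value are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A generator takes (question, passages) and returns
# (answer_text, per-token probability distributions).
Generator = Callable[[str, List[str]], Tuple[str, Sequence[Sequence[float]]]]


def mean_token_entropy(token_dists: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) across per-token distributions."""
    if not token_dists:
        return 0.0
    entropies = [
        -sum(p * math.log(p) for p in dist if p > 0.0) for dist in token_dists
    ]
    return sum(entropies) / len(entropies)


def answer_adaptively(
    question: str,
    generate: Generator,
    retrieve: Callable[[str], List[str]],
    threshold: float = 0.5,  # assumed value; would need tuning in practice
) -> str:
    """Answer directly when confident; retrieve and regenerate otherwise."""
    draft, dists = generate(question, [])
    if mean_token_entropy(dists) <= threshold:
        return draft  # low uncertainty: skip retrieval entirely
    passages = retrieve(question)
    grounded, _ = generate(question, passages)
    return grounded
```

The point of gating is cost: retrieval and the second generation pass run only for questions where the first draft looks unreliable.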
