ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios

ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios [8.3]
ViDoRe v3は、視覚的にリッチなドキュメントコーパス上のマルチタイプクエリを特徴とする総合マルチモーダルRAGベンチマークである。さまざまな専門家ドメインにまたがる10のデータセットをカバーしており、26,000のドキュメントページと3,099の人間認証クエリをペアにしている。
論文参考訳（メタデータ） (Tue, 13 Jan 2026 15:00:33 GMT)
「We introduce ViDoRe V3, a comprehensive multi- modal RAG benchmark featuring multi-type queries over visually rich document corpora. It covers 10 datasets across diverse professional domains, comprising 26,000 document pages paired with 3,099 human-verified queries, each available in 6 languages.」というベンチマーク。「Evaluating state-of-the-art RAG pipelines, we find that visual retrievers outperform textual ones, late interaction and textual reranking yield substantial gains, and visual context improves answer generation quality.」が意外。
リポジトリはvidore (Vidore)

コメントを残す

コメントを残す コメントをキャンセル