DocR1: Evidence Page-Guided GRPO for Multi-Page Document Understanding

DocR1: Evidence Page-Guided GRPO for Multi-Page Document Understanding [97.4]
本稿では,新しいRLフレームワークであるEvidence Page-Guided GRPOで学習したMLLMであるDocR1を紹介する。 EviGRPOには、粗大な推論戦略を促進するエビデンス対応報酬機構が組み込まれている。我々は,DocR1が複数ページのタスクに対して最先端のパフォーマンスを達成し,シングルページのベンチマークにおいて強い結果を維持していることを示す。
論文参考訳（メタデータ） (Sun, 10 Aug 2025 12:03:45 GMT)
多くのページがあるドキュメント読解のためのフレームワークの提案。
「When engaging in multi-page reading comprehension, humans typically begin by identifying the pages likely to contain the answer, and then focus on locating the specific regions that correspond to the question and answer within those pages. Inspired by this “coarse-to-fine” reading strategy, EviGRPO mimics the human approach by first selecting a small set of potentially relevant pages at a coarse level, followed by fine-grained reasoning over the selected content.」とのことだが、このようなドメイン（タスク）特化のアプローチはいまだ有効なんだろうか。。

コメントを残す

コメントを残す コメントをキャンセル