OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

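This paper proposes the Outcome-based Process Verifier (OPV), which verifies the rationale process of summarized outcomes from long chains of thought. OPV outperforms much larger open-source models such as Qwen3-Max-Preview, with an F1 score of 83.1 versus 76.3.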
Reference translation of the paper (metadata) (Thu, 11 Dec 2025 15:47:38 GMT)
"We introduced the Outcome-based Process Verifier (OPV), which bridges outcome and process verification by operating on summarized solutions from long CoTs. Through an iterative active learning framework with expert annotations, OPV progressively improves its verification capabilities while minimizing annotation costs." The paper proposes an approach for verifying CoT-style reasoning processes.
