AgenTracer: Who Is Inducing Failure in the LLM Agentic Systems?

AgenTracer: Who Is Inducing Failure in the LLM Agentic Systems? [71.2]
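Agentic systems built on large language models (LLMs) achieve strong performance by combining multiple models and tools, but this complexity makes them more fragile and more prone to malfunction. To address this, the paper proposes AgenTracer, which automatically annotates failed multi-agent trajectories, and develops AgenTracer-8B, a new lightweight tracer capable of error diagnosis. The system outperforms existing large language models and provides actionable feedback that helps agents self-correct and evolve.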
Reference translation of the paper (metadata) (Wed, 03 Sep 2025 13:42:14 GMT)
This work studies the analysis of trajectory logs, a major problem in developing LLM-based agents. The authors state: "By introducing AgenTracer, we provide the first automated framework capable of systematically generating annotated failure trajectories, as well as AgenTracer-8B, a lightweight yet effective failure tracer that leverages multi-granular RL to achieve prevailing diagnostic accuracy." AgenTracer-8B is a post-trained Qwen3-8B model and is said to perform very well for its size.
The project site is Academic Project Page, and the repository is GitHub – bingreeky/AgenTracer: AgenTracer: A Lightweight Failure Attributor for Agentic Systems
