UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding

UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding [16.9]
トレーニングと推論の両方においてGUIエージェントを強化するUI-AGILEを導入する。トレーニングのために,スーパービジョン・ファイン・チューニング(SFT)プロセスの一連の改善を提案する。推測のために,高解像度ディスプレイのグラウンド化精度を劇的に向上させるために,選択による分解グラウンド化を提案する。
論文参考訳（メタデータ） (Sat, 09 Aug 2025 17:51:27 GMT)
GUIエージェントの性能に大きく影響するグラウンディング能力を強化するフレームワークの提案。「UI-AGILE enhances GUI agents through improved training with a Continuous Reward function, Simple Thinking reward, and Cropping-based Resampling, and inference with Decomposed Grounding with Selection.」とのこと。
リポジトリはGitHub – KDEGroup/UI-AGILE

コメントを残す

コメントを残す コメントをキャンセル