UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

  • UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning [151.0]
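    Developing autonomous agents for graphical user interfaces is a major challenge in artificial intelligence. This paper proposes UI-TARS-2, a GUI-centered agent model. Empirical evaluation shows that UI-TARS-2 improves substantially on the earlier UI-TARS-1.5.
    Paper  Reference translation (metadata)   (Tue, 02 Sep 2025 17:44:45 GMT)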
  • An update to the earlier posts on this blog: UI-TARS: Pioneering Automated GUI Interaction with Native Agents, UFO2: The Desktop AgentOS, and UI-TARS-1.5. The authors claim a large improvement over the previous model: "Empirical evaluation shows that UI-TARS-2 delivers significant improvements over UI-TARS-1.5 [56], achieving strong results in both GUI-based interaction and game environments. On GUI benchmarks, the model reaches 88.2 on Online-Mind2Web [77], 47.5 on OSWorld [75], 50.6 on WindowsAgentArena [10], and 73.3 on AndroidWorld [52], representing clear gains over the previous generation and outperforming strong baselines such as Claude and OpenAI agents in multiple cases." The points below are presented as the improvements, though ever since the first version the strong impression has been of a team doing everything it possibly can.
    • First, to mitigate data scarcity, we design a scalable Data Flywheel that co-evolves the model and its training corpus through continual pretraining, supervised fine-tuning, rejection sampling, and multiturn RL.
    • Second, to overcome the difficulties of scalable multi-turn RL, we design a training framework that stabilizes optimization in long-horizon settings.
    • Third, to move beyond the limitations of pure GUI interaction, we construct a hybrid GUI-centered environment that augments on-screen actions with access to complementary resources such as file systems, terminals, and other external tools, enabling agents to solve a broader spectrum of realistic workflows.
    • Fourth, to support large-scale training and evaluation, we build a unified sandbox platform capable of orchestrating heterogeneous environments—ranging from cloud VMs for GUI interaction to browser-based sandboxes for games—under a consistent API.
  • The repository is at GitHub – bytedance/UI-TARS
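
The fourth point above, running heterogeneous environments behind one consistent API, can be pictured with a toy sketch. Everything in it is hypothetical: `Environment`, `StepResult`, `ToyGuiEnv`, `ToyGameEnv`, and `rollout` are names made up for illustration and are not the UI-TARS-2 sandbox interface, which the quoted text does not describe. The only idea it shows is that a GUI-like task and a game-like task can share the same `reset`/`step` shape, so a single multi-turn rollout loop can drive both.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class StepResult:
    observation: str
    reward: float
    done: bool


class Environment(Protocol):
    """Hypothetical common interface; not the actual UI-TARS-2 API."""

    def reset(self) -> str: ...
    def step(self, action: str) -> StepResult: ...


@dataclass
class ToyGuiEnv:
    """Stand-in for a GUI task: the episode ends when 'submit' is clicked."""

    history: List[str] = field(default_factory=list)

    def reset(self) -> str:
        self.history = []
        return "screen: [cancel] [submit]"

    def step(self, action: str) -> StepResult:
        self.history.append(action)
        done = action == "click submit"
        observation = "screen: done" if done else "screen: [cancel] [submit]"
        return StepResult(observation, float(done), done)


@dataclass
class ToyGameEnv:
    """Stand-in for a browser game: reach position 3 by moving right."""

    position: int = 0

    def reset(self) -> str:
        self.position = 0
        return "pos=0"

    def step(self, action: str) -> StepResult:
        if action == "right":
            self.position += 1
        done = self.position >= 3
        return StepResult(f"pos={self.position}", float(done), done)


def rollout(env: Environment, actions: List[str]) -> float:
    """Run one multi-turn episode and return the total reward."""
    env.reset()
    total = 0.0
    for action in actions:
        result = env.step(action)
        total += result.reward
        if result.done:
            break
    return total
```

Because both toy classes satisfy the same protocol, `rollout(ToyGuiEnv(), [...])` and `rollout(ToyGameEnv(), [...])` go through identical code. A real platform would additionally need to handle screenshots, VM lifecycle, and tool access, none of which is modeled here.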
