February 17, 2026 – Introducing the latest arXiv papers

POINTS-GUI-G: GUI-Grounding Journey

POINTS-GUI-G: GUI-Grounding Journey [22.4]
POINTS-GUI-G-8B achieves state-of-the-art performance, scoring 59.9 on ScreenSpot-Pro, 66.0 on OSWorld-G, 95.7 on ScreenSpot-v2, and 49.9 on UI-Vision. The model's success is driven by three factors: (1) refined data engineering, (2) improved training strategies, and (3) reinforcement learning with verifiable rewards.
Reference translation of the paper (metadata) (Fri, 06 Feb 2026 05:14:11 GMT)
A proposal for a small model that performs well on GUI grounding. The construction process is also instructive: "(1) Refined Data Engineering, involving the unification of diverse open-source datasets format alongside sophisticated strategies for augmentation, filtering, and difficulty grading; (2) Improved Training Strategies, including continuous fine-tuning of the vision encoder to enhance perceptual accuracy and maintaining resolution consistency between training and inference; and (3) Reinforcement Learning (RL) with Verifiable Rewards."
The repository is GitHub – Tencent/POINTS-GUI
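
To make "verifiable rewards" concrete for GUI grounding, here is a minimal sketch of one common way such a reward can be checked automatically: the predicted click point either falls inside the ground-truth element box or it does not. This is an assumption for illustration, not the reward used in POINTS-GUI-G (the post does not describe it); `BBox` and `point_in_box_reward` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float


def point_in_box_reward(x: float, y: float, target: BBox) -> float:
    """Return 1.0 if the predicted click (x, y) lands inside the target box, else 0.0.

    Assumed reward shape for illustration only; edges count as inside.
    """
    inside = target.left <= x <= target.right and target.top <= y <= target.bottom
    return 1.0 if inside else 0.0


# Example: a button occupying x in [40, 120], y in [10, 30].
button = BBox(left=40, top=10, right=120, bottom=30)
hit = point_in_box_reward(50, 20, button)
miss = point_in_box_reward(10, 20, button)
```

A binary check like this needs no learned reward model, which is what makes it "verifiable": the same prediction always gets the same score.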

UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents [50.1]
Online reinforcement learning (RL) offers a promising paradigm for enhancing GUI agents through direct environment interaction. The authors propose UI-Mem, a novel framework that enhances GUI online RL with a hierarchical experience memory. UI-Mem significantly outperforms traditional RL baselines and static reuse strategies.
Reference translation of the paper (metadata) (Thu, 05 Feb 2026 16:21:43 GMT)
A proposal for a memory mechanism for GUI agents, which "constructs a hierarchical, self-evolving memory that decomposes raw experiences into reusable workflows, subtask skills, and failure patterns. We utilized this memory through a stratified group sampling mechanism tailored for GRPO, which balances memory-guided exploitation with necessary exploration to facilitate effective advantage estimation."
The repository is UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents
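
The quoted passage ties stratified group sampling to GRPO's advantage estimation. The sketch below shows the standard group-relative advantage (each reward normalized by its group's mean and standard deviation), plus a hypothetical way to split one rollout group into memory-guided and unguided rollouts. The split rule, the `guided_fraction` parameter, and both function names are assumptions for illustration; the post does not describe how UI-Mem actually samples.

```python
import statistics


def split_group(group_size: int, guided_fraction: float) -> tuple[int, int]:
    """Split one rollout group into (memory-guided, unguided) counts.

    Hypothetical rule: round to the requested fraction, but always keep
    at least one rollout of each kind so the group mixes both strata.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    guided = round(group_size * guided_fraction)
    guided = max(1, min(group_size - 1, guided))
    return guided, group_size - guided


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantage: (reward - group mean) / (group std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Example: 8 rollouts per task, half of them guided by retrieved memory.
guided, unguided = split_group(8, 0.5)
mixed = group_relative_advantages([1.0, 1.0, 0.0, 0.0])
uniform = group_relative_advantages([1.0, 1.0, 1.0, 1.0])
```

When every rollout in a group receives the same reward, all advantages are zero and the group contributes no learning signal. That is one plausible reading of why mixing guided and unguided rollouts would "facilitate effective advantage estimation".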

UI-Venus-1.5 Technical Report [64.5]
The authors present UI-Venus-1.5, a unified, end-to-end GUI agent. The proposed model family consists of two dense variants (2B and 8B) and one mixture-of-experts variant (30B-A3B). In addition, UI-Venus-1.5 demonstrates robust navigation capabilities across a variety of Chinese mobile apps.
Reference translation of the paper (metadata) (Mon, 09 Feb 2026 18:43:40 GMT)
Version 1.5 of UI-Venus. The report states: "Unified Single-Agent via Model Merging: A major distinction from UI-Venus-1.0 is that UI-Venus-1.5 is a purely end-to-end model, which greatly simplifies deployment for users." Although it is called 1.5, it seems to differ considerably from 1.0.
The repository is GitHub – inclusionAI/UI-Venus: UI-Venus is a native UI agent designed to perform precise GUI element grounding and effective navigation using only screenshots as input.