{"id":6136,"date":"2025-01-31T05:21:00","date_gmt":"2025-01-30T20:21:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=6136"},"modified":"2025-01-31T05:21:00","modified_gmt":"2025-01-30T20:21:00","slug":"ui-tars-pioneering-automated-gui-interaction-with-native-agents","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=6136","title":{"rendered":"UI-TARS: Pioneering Automated GUI Interaction with Native Agents"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>UI-TARS: Pioneering Automated GUI Interaction with Native Agents\u00a0<\/strong>[58.2]<br>This paper introduces UI-TARS, a native GUI agent model. On the OSWorld benchmark, UI-TARS scores 24.6 with 50 steps and 22.7 with 15 steps, outperforming Claude (22.0 and 14.9, respectively).<br><a href=\"http:\/\/arxiv.org\/abs\/2501.12326v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2501.12326v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Tue, 21 Jan 2025 17:48:10 GMT)<\/li>\n\n\n\n<li>Proposes UI-TARS, a GUI agent, and claims SOTA across a range of tasks. \u201cUI-TARS incorporates several key innovations: (1) Enhanced Perception: leveraging a large-scale dataset of GUI screenshots for context-aware understanding of UI elements and precise captioning; (2) Unified Action Modeling, which standardizes actions into a unified space across platforms and achieves precise grounding and interaction through large-scale action traces; (3) System-2 Reasoning, which incorporates deliberate reasoning into multi-step decision making, involving 
multiple reasoning patterns such as task decomposition, reflection thinking, milestone recognition, etc. (4) Iterative Training with Reflective Online Traces, which addresses the data bottleneck by automatically collecting, filtering, and reflectively refining new interaction traces on hundreds of virtual machines.\u201d It really feels like they packed in everything they could.<\/li>\n\n\n\n<li>The repository is at <a href=\"https:\/\/github.com\/bytedance\/UI-TARS\">GitHub &#8211; bytedance\/UI-TARS<\/a><\/li>\n<\/ul>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[42,180,181,232],"class_list":["post-6136","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-autonomous-agent","tag-gui","tag-gui-agent","tag-lrm"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6136","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6136"}],"version-history":[{"count":0,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6136\/revisions"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6136"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6136"},{"taxonomy
":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6136"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}