{"id":6971,"date":"2025-07-03T05:24:00","date_gmt":"2025-07-02T20:24:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=6971"},"modified":"2025-06-28T20:59:58","modified_gmt":"2025-06-28T11:59:58","slug":"os-harm-a-benchmark-for-measuring-safety-of-computer-use-agents","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=6971","title":{"rendered":"OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents\u00a0<\/strong>[34.4]<br>We introduce OS-Harm, a new benchmark for measuring the safety of computer use agents. OS-Harm is built on top of the OSWorld environment and aims to test models across three categories: deliberate user misuse, prompt injection attacks, and model misbehavior. We evaluate computer use agents based on frontier models and provide insights into their safety.<br><a href=\"http:\/\/arxiv.org\/abs\/2506.14866v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2506.14866v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Tue, 17 Jun 2025 17:59:31 GMT)<\/li>\n\n\n\n<li>The paper proposes the following benchmark: \u201cFirst, we identify three main categories of risk: (1) deliberate user misuse, where the user asks the agent to pursue 
a harmful goal, (2) prompt injection attacks, where external attackers insert malicious content into third-party data (incoming emails, web pages, notifications, etc.) that steers the model away from performing its task and towards the attacker\u2019s goal, and (3) model misbehavior, including benign tasks which are likely to result in costly mistakes or reveal model misalignment. For each category, we design tasks that differ in the type of safety violations and in the apps they require (such as Thunderbird, VS Code, Terminal, LibreOffice Impress, etc.), for a total of 150 tasks.\u201d<\/li>\n\n\n\n<li>Repository: <a href=\"https:\/\/github.com\/tml-epfl\/os-harm\">GitHub &#8211; tml-epfl\/os-harm: OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[517],"class_list":["post-6971","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-517"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6971","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6971"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6971\/revisions"}],"predecessor-version":[{"id":6972,"href":"htt
ps:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6971\/revisions\/6972"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6971"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6971"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6971"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}