{"id":7536,"date":"2025-10-08T06:16:00","date_gmt":"2025-10-07T21:16:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=7536"},"modified":"2025-10-04T17:19:52","modified_gmt":"2025-10-04T08:19:52","slug":"your-agent-may-misevolve-emergent-risks-in-self-evolving-llm-agents","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=7536","title":{"rendered":"Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents\u00a0"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents\u00a0<\/strong>[58.7]<br>We study the case where an agent's self-evolution deviates in unintended ways, leading to undesirable or even harmful outcomes. Our empirical findings show that misevolution is a widespread risk that affects even agents built on top-tier LLMs. We discuss potential mitigation strategies to encourage further research on building safer and more trustworthy self-evolving agents.<br><a href=\"http:\/\/arxiv.org\/abs\/2509.26354v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2509.26354v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Tue, 30 Sep 2025 14:55:55 GMT)<\/li>\n\n\n\n<li>\u300c(1) In model evolution, we assess whether 
self-evolving agents compromise their safety alignment after self-updating their model parameters. (2) In memory evolution, we test whether memory-augmented agents learn undesirable preferences or degrade their risk awareness while accumulating experience into memory. (3) In tool evolution, we evaluate whether agents will spontaneously induce risks in the tool creation-reuse loop, and test agents\u2019 ability to reject appealing but potentially malicious tools retrieved from the Internet. (4) In workflow evolution, we analyze whether automatically adjusted workflows can lead to safety decay.\u300d The paper evaluates Misevolve from these four perspectives and points out that it is a realistic problem.<\/li>\n\n\n\n<li>The repository is <a href=\"https:\/\/github.com\/ShaoShuai0605\/Misevolution\">GitHub &#8211; ShaoShuai0605\/Misevolution: Official Repo of Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[356],"class_list":["post-7536","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-self-x"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7536","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7536"}],"version-history":[{"count":1,"href":"htt
ps:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7536\/revisions"}],"predecessor-version":[{"id":7538,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7536\/revisions\/7538"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7536"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7536"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7536"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}