{"id":6957,"date":"2025-06-25T05:50:00","date_gmt":"2025-06-24T20:50:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=6957"},"modified":"2025-06-22T07:01:10","modified_gmt":"2025-06-21T22:01:10","slug":"pushing-the-limits-of-safety-a-technical-report-on-the-atlas-challenge-2025","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=6957","title":{"rendered":"Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025\u00a0<\/strong>[167.9]<br>This paper reports the results of the Adversarial Testing &amp; Large-model Alignment Safety Grand Challenge (ATLAS) 2025. The competition involved 86 teams testing MLLM vulnerabilities through adversarial image-text attacks in two phases: white-box and black-box evaluation. The challenge establishes a new benchmark for MLLM safety evaluation and lays a foundation for making AI systems safer.<br><a href=\"http:\/\/arxiv.org\/abs\/2506.12430v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2506.12430v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Sat, 14 Jun 2025 10:03:17 
GMT)<\/li>\n\n\n\n<li>A report on the results of a competition on attacking MLLMs. The techniques used in a competition with many participating teams are very instructive. It is interesting how well the entries exploit images, such as the flowchart-based jailbreak described by the first-place team: &#8220;In this competition, we proposed an effective multimodal jailbreak strategy by embedding malicious intent within visually structured diagrams, particularly flowcharts, and enhancing it with carefully designed textual prompts. Our approach leveraged the weaknesses in safety alignment of vision-language models, exploiting their tendency to follow structured visual and textual cues.&#8221;<\/li>\n\n\n\n<li>Repository: <a href=\"https:\/\/github.com\/NY1024\/ATLAS_Challenge_2025\">GitHub &#8211; 
NY1024\/ATLAS_Challenge_2025<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[32,207,251],"class_list":["post-6957","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-attack","tag-jailbreak","tag-mllm"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6957","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6957"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6957\/revisions"}],"predecessor-version":[{"id":6958,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6957\/revisions\/6958"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6957"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6957"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6957"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}