{"id":7256,"date":"2025-08-11T14:22:00","date_gmt":"2025-08-11T05:22:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=7256"},"modified":"2025-08-11T13:27:15","modified_gmt":"2025-08-11T04:27:15","slug":"r-zero-self-evolving-reasoning-llm-from-zero-data","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=7256","title":{"rendered":"R-Zero: Self-Evolving Reasoning LLM from Zero Data"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>R-Zero: Self-Evolving Reasoning LLM from Zero Data\u00a0<\/strong>[56.7]<br>Self-evolving large language models (LLMs) offer a scalable path toward superintelligence by autonomously generating, refining, and learning from their own experiences. Existing methods for training such models still rely heavily on vast amounts of human-curated tasks and labels. R-Zero is a fully autonomous framework that generates its own training data from scratch.<br><a href=\"http:\/\/arxiv.org\/abs\/2508.05004v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2508.05004v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Thu, 07 Aug 2025 03:38:16 GMT)<\/li>\n\n\n\n<li>A GAN-like framework: \u201cwe propose R-Zero, a framework for training reasoning LLMs that can self-evolve from zero external data. 
In R-Zero, a single base model is initialized with two roles \u2013 a Challenger and a Solver that are independently optimized but co-evolve throughout the RL process.\u201d and \u201cChallenger is rewarded for proposing tasks near the edge of the Solver\u2019s capability, and the Solver is rewarded for solving increasingly challenging tasks posed by the Challenger.\u201d<\/li>\n\n\n\n<li>Repository: <a href=\"https:\/\/github.com\/Chengsong-Huang\/R-Zero\">Chengsong-Huang\/R-Zero: codes for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https:\/\/www.arxiv.org\/pdf\/2508.05004)<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[356,390],"class_list":["post-7256","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-self-x","tag-synthetic-data"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7256","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7256"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7256\/revisions"}],"predecessor-version":[{"id":7257,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7256\/revisions\/7257"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/word
press\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7256"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7256"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7256"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}