{"id":7555,"date":"2025-10-06T03:55:00","date_gmt":"2025-10-05T18:55:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=7555"},"modified":"2025-10-04T22:01:37","modified_gmt":"2025-10-04T13:01:37","slug":"rlad-training-llms-to-discover-abstractions-for-solving-reasoning-problems","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=7555","title":{"rendered":"RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems\u00a0"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems\u00a0<\/strong>[99.0]<br>Given a problem, we train models that can propose multiple abstractions, followed by RL that incentivizes building a solution. The resulting RL training paradigm, called RLAD, jointly trains an abstraction generator and a solution generator. We show that allocating more test-time compute to generating abstractions benefits performance more than generating many solutions at a large test budget.<br><a href=\"http:\/\/arxiv.org\/abs\/2510.02263v1\">Paper<\/a>\u00a0\u00a0<a 
href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2510.02263v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Thu, 02 Oct 2025 17:44:23 GMT)<\/li>\n\n\n\n<li>The paper proposes an abstraction model, described as \"We introduce reasoning abstractions: concise representations of procedural and factual knowledge that are expressed in natural language, as a means to broaden the reasoning strategies used by LLMs\", and confirms that passing problems through this step improves performance. The results are interesting, but so is the remark that \"We tried training a single model to do both abstraction generation and solution generation, after a lightweight SFT on traces showing questions paired with abstractions and corresponding solutions, but we found this approach to very quickly lose the ability of proposing abstractions over the course of RL training.\" I wonder why that happens...<\/li>\n\n\n\n<li>The project site is <a 
href=\"https:\/\/cohenqu.github.io\/rlad.github.io\/\">RLAD<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[42],"class_list":["post-7555","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-autonomous-agent"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7555","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7555"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7555\/revisions"}],"predecessor-version":[{"id":7556,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7555\/revisions\/7556"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7555"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7555"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7555"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}