{"id":7285,"date":"2025-08-12T03:37:00","date_gmt":"2025-08-11T18:37:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=7285"},"modified":"2025-08-11T15:41:10","modified_gmt":"2025-08-11T06:41:10","slug":"self-questioning-language-models","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=7285","title":{"rendered":"Self-Questioning Language Models\u00a0"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>Self-Questioning Language Models\u00a0<\/strong>[51.8]<br>This paper proposes an asymmetric self-play framework in which a proposer is given a topic and generates questions for a solver. Both the proposer and the solver are trained through reinforcement learning. The tasks studied are three-digit multiplication, algebra problems from the OMEGA benchmark, and programming problems from Codeforces.<br><a href=\"http:\/\/arxiv.org\/abs\/2508.03682v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2508.03682v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Tue, 05 Aug 2025 17:51:33 GMT)<\/li>\n\n\n\n<li>The proposed framework, in the authors' words: \u201cOur method leverages the intrinsic capabilities of large language models by casting them in dual roles of proposer and solver within an asymmetric self-play setup. 
By rewarding the generation of problems that are neither too easy nor too difficult, and by reinforcing answers via internal agreement or external verification, we demonstrate that models can meaningfully improve their reasoning skills through interaction with self-generated content alone.\u201d I think this is also close to <a href=\"https:\/\/devneko.jp\/wordpress\/?p=7256\">R-Zero: Self-Evolving Reasoning LLM from Zero Data \u2013 arXiv\u6700\u65b0\u8ad6\u6587\u306e\u7d39\u4ecb<\/a>.<\/li>\n\n\n\n<li>Project site: <a href=\"https:\/\/self-questioning.github.io\/\">Self-Questioning Language Models<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[356],"class_list":["post-7285","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-self-x"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7285","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7285"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7285\/revisions"}],"predecessor-version":[{"id":7286,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7285\/revisions\/7286"}],"wp:attachment"
:[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7285"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7285"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7285"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}