{"id":7935,"date":"2025-12-15T16:11:00","date_gmt":"2025-12-15T07:11:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=7935"},"modified":"2025-12-13T22:14:09","modified_gmt":"2025-12-13T13:14:09","slug":"scaling-behavior-of-discrete-diffusion-language-models","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=7935","title":{"rendered":"Scaling Behavior of Discrete Diffusion Language Models"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>Scaling Behavior of Discrete Diffusion Language Models\u00a0<\/strong>[74.7]<br>This paper studies the scaling behavior of discrete diffusion language models (DLMs) across different noise types. The experiments show that DLM scaling behavior depends strongly on the noise type and differs considerably from that of autoregressive language models (ALMs). The authors scale their uniform diffusion model up to 10B parameters trained for 10^22 FLOPs, confirming the predicted scaling behavior and making it the largest publicly known uniform diffusion model to date.<br><a href=\"http:\/\/arxiv.org\/abs\/2512.10858v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2512.10858v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Thu, 11 Dec 2025 17:54:10 GMT)<\/li>\n\n\n\n<li>Regarding diffusion language 
models, where research has recently advanced and applications are starting to appear, the authors argue: \u201cOur findings support the case for discrete diffusion language models (DLMs) as a viable alternative to autoregressive language models (ALMs), the prevalent paradigm. DLMs can resolve core limitations of ALMs, enabling parallel generation for improved throughput, possessing the ability to revise and self-correct previously generated tokens, providing trivial ways of scaling test-time compute, and now also showing signs of improved scaling behavior with increased training compute. All in all, we conclude that DLMs in general, and uniform diffusion in particular, are promising candidates for next-generation LLMs.\u201d<\/li>\n\n\n\n<li>Repository: <a href=\"https:\/\/github.com\/dvruette\/gidd-easydel\">GitHub &#8211; dvruette\/gidd-easydel<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[114,223],"class_list":["post-7935","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-diffusion-model","tag-llm"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7935","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7935"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7935\/revisions"}],"predecessor-version":[{"id":7936,"href":"https:\/\/devneko.jp\/
wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7935\/revisions\/7936"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7935"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7935"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7935"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}