{"id":6660,"date":"2025-05-05T05:23:00","date_gmt":"2025-05-04T20:23:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=6660"},"modified":"2025-05-05T05:23:00","modified_gmt":"2025-05-04T20:23:00","slug":"qwen3-phi-4-reasoning-mimo-7b-olmo2-1b-mellum-4b","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=6660","title":{"rendered":"Qwen3, Phi-4 reasoning, MiMo 7B, OLMo2 1B, Mellum 4B"},"content":{"rendered":"\n<p>Last week brought a lot of news about open models. Among them, Qwen3 is major news (<a href=\"https:\/\/qwenlm.github.io\/blog\/qwen3\/\">Qwen3: Think Deeper, Act Faster | Qwen<\/a>). In addition to the MoE models Qwen3-235B-A22B and Qwen3-30B-A3B, the dense models Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B have been released (<a href=\"https:\/\/huggingface.co\/collections\/Qwen\/qwen3-67dd247413f0e2e4f653967f\">Qwen3 &#8211; a Qwen Collection<\/a>). The license is Apache 2.0. Microsoft's release of the Phi-4 reasoning models (<a href=\"https:\/\/techcommunity.microsoft.com\/blog\/educatordeveloperblog\/showcasing-phi-4-reasoning-a-game-changer-for-ai-developers\/4409892\">Showcasing Phi-4-Reasoning: A Game-Changer for AI Developers | Microsoft Community Hub<\/a>, <a href=\"https:\/\/huggingface.co\/microsoft\">huggingface<\/a>) is also noteworthy.<\/p>\n\n\n\n<p>There were also many SLM announcements; MiMo from Xiaomi (<a href=\"https:\/\/github.com\/XiaomiMiMo\/MiMo\">GitHub &#8211; XiaomiMiMo\/MiMo: MiMo: Unlocking the Reasoning Potential of Language Model \u2013 From Pretraining to Posttraining<\/a>) and Ai2's <a href=\"https:\/\/allenai.org\/olmo\/release-notes#olmo-2-1b\">OLMo release notes 
| Ai2<\/a> are interesting. Mellum from JetBrains (<a href=\"https:\/\/blog.jetbrains.com\/ai\/2025\/04\/mellum-goes-open-source-a-purpose-built-llm-for-developers-now-on-hugging-face\/\">Mellum Goes Open Source: A Purpose-Built LLM for Developers, Now on Hugging Face | The JetBrains Blog<\/a>) is a specialized model, as its announcement puts it: \u201cMellum doesn\u2019t try to know everything. It\u2019s designed to do one thing really well: code completion. We call it a focal model \u2013 built with purposeful depth and not concerned with chasing breadth.\u201d At present Mellum's performance can hardly be called sufficient, but specializing an SLM to strengthen it and improve its cost-effectiveness is a promising direction. The 671B DeepSeek-Prover-V2 is impressive, but I think combinations such as its clever use of a 7B model will also become important.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Phi-4-reasoning Technical Report&nbsp;<\/strong>[42.5]<br>Phi-4-reasoning is a 14-billion-parameter reasoning model that achieves strong performance on complex reasoning tasks. We developed Phi-4-reasoning-plus. Both models outperform larger open-weight models such as DeepSeek-R1-Distill-Llama-70B and approach the performance level of the full DeepSeek-R1 model.<br><a 
href=\"http:\/\/arxiv.org\/abs\/2504.21318v1\">Paper<\/a>&nbsp;&nbsp;<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2504.21318v1\">Reference translation (metadata)<\/a>&nbsp; &nbsp;(Wed, 30 Apr 2025 05:05:09 GMT)<\/li>\n\n\n\n<li>An LRM in the Phi-4 series<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math&nbsp;<\/strong>[135.1]<br>Chain-of-Thought (CoT) significantly improves the formal reasoning capabilities of large language models (LLMs). However, improving reasoning in Small Language Models (SLMs) remains difficult because of their limited model capacity. This work proposes a systematic training recipe for SLMs consisting of four stages: (1) large-scale mid-training on diverse distilled long-CoT data, (2) fine-tuning on high-quality long-CoT data, (3) Rollout DPO leveraging a carefully curated preference dataset, and (4) reinforcement learning (RL) with verifiable reward.<br><a href=\"http:\/\/arxiv.org\/abs\/2504.21233v1\">Paper<\/a>&nbsp;&nbsp;<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2504.21233v1\">Reference translation (metadata)<\/a>&nbsp; &nbsp;(Wed, 30 Apr 2025 00:04:35 GMT)<\/li>\n\n\n\n<li>Building a reasoning model from an SLM. The authors report confirming its effectiveness: \u201cThe resulting Phi-4-Mini-Reasoning model exceeds, on 
math reasoning tasks, much larger reasoning models, e.g., outperforming DeepSeek-R1-Distill-Qwen-7B by 3.2 points and DeepSeek-R1-Distill-Llama-8B by 7.7 points on Math-500.\u201d<\/li>\n\n\n\n<li>An interesting result: reasoning is effective even for small models.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition&nbsp;<\/strong>[24.5]<br>We introduce DeepSeek-Prover-V2. The model achieves state-of-the-art performance in neural theorem proving, reaching an 88.9% pass ratio on the MiniF2F-test and solving 49 out of 658 problems from PutnamBench. In addition to standard benchmarks, we introduce ProverBench, a collection of 325 formalized problems, to enrich our evaluation, including 15 problems selected from recent AIME competitions.<br><a href=\"http:\/\/arxiv.org\/abs\/2504.21801v1\">Paper<\/a>&nbsp;&nbsp;<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2504.21801v1\">Reference translation (metadata)<\/a>&nbsp; &nbsp;(Wed, 30 Apr 2025 16:57:48 GMT)<\/li>\n\n\n\n<li>\u201cWe first prompt DeepSeek-V3 to generate a natural-language proof sketch while simultaneously formalizing it into a Lean statement with sorry placeholders for omitted proof details. 
A 7B prover model then recursively solves the decomposed subgoals. By combining these subgoal proofs, we construct a complete formal proof for the original complex problem. This composed proof is appended to DeepSeek-V3\u2019s original chain-of-thought, creating high-quality cold-start training data for formal mathematical reasoning.\u201d<\/li>\n\n\n\n<li>The repository is <a href=\"https:\/\/github.com\/deepseek-ai\/DeepSeek-Prover-V2\">GitHub &#8211; deepseek-ai\/DeepSeek-Prover-V2<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Last week brought a lot of news about open models. Among them, Qwen3 is major news (Qwen3: Think Deeper, Act Faster | Qwen). In addition to the MoE models Qwen3-235B-A22B and Qwen3 &hellip; <a href=\"https:\/\/devneko.jp\/wordpress\/?p=6660\" class=\"more-link\"><span class=\"screen-reader-text\">&#8220;Qwen3, Phi-4 reasoning, MiMo 7B, OLMo2 1B, Mellum 4B&#8221; 
<\/span>Continue reading<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[223,232,365,366,368],"class_list":["post-6660","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-llm","tag-lrm","tag-slm","tag-slow-thinking","tag-small-model"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6660","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6660"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6660\/revisions"}],"predecessor-version":[{"id":6700,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6660\/revisions\/6700"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6660"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6660"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6660"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}