{"id":6370,"date":"2025-03-10T06:01:00","date_gmt":"2025-03-09T21:01:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=6370"},"modified":"2025-03-10T06:01:00","modified_gmt":"2025-03-09T21:01:00","slug":"qwq-32b-jamba-1-6-aya-vision-mistral-ocr","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=6370","title":{"rendered":"QwQ-32B, Jamba 1.6, RWKV7 G1, Aya Vision, Mistral OCR, DeepSeek Open Source Week"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Last week again brought a variety of news.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">QwQ-32B claims performance competitive with DeepSeek-R1 (671B, 37B active) (<a href=\"https:\/\/qwenlm.github.io\/blog\/qwq-32b\/\">QwQ-32B: Embracing the Power of Reinforcement Learning | Qwen<\/a>). The post states, &#8220;This remarkable outcome underscores the effectiveness of RL when applied to robust foundation models pretrained on extensive world knowledge.&#8221; The effectiveness of reinforcement learning really comes through. Performance is up substantially from the Preview covered in <a href=\"https:\/\/devneko.jp\/wordpress\/?p=5816\">Model Context Protocol (MCP), QwQ, OLMo 2 \u2013 Introducing the Latest arXiv Papers<\/a> and <a href=\"https:\/\/qwenlm.github.io\/blog\/qwq-32b-preview\/\">QwQ: Reflect Deeply on the Boundaries of the Unknown | Qwen<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Jamba 1.6 is an LLM that claims to outperform competitors such as Mistral, Llama, and Cohere (<a href=\"https:\/\/www.ai21.com\/jamba\/\">Jamba 1.6: The Best Open Model for Enterprise Deployment | AI21<\/a>). It uses a hybrid SSM + Transformer architecture and is said to be fast (<a 
href=\"https:\/\/www.ai21.com\/blog\/introducing-jamba-1-6\/\">The Best Private LLM for Enterprise AI Deployment | AI21<\/a>). Two models are available, Jamba Mini 1.6 (12B active\/52B total) and Jamba Large 1.6 (94B active\/398B total), and their repositories are public (<a href=\"https:\/\/huggingface.co\/collections\/ai21labs\/jamba-16-67c990671a26dcbfa62d18fa\">Jamba 1.6 &#8211; a ai21labs Collection<\/a>).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">RWKV has also released a reasoning model, <a href=\"https:\/\/huggingface.co\/BlinkDL\/rwkv7-g1\" target=\"_blank\" rel=\"noreferrer noopener\">RWKV7-G1<\/a> &#8220;GooseOne&#8221; (<a href=\"https:\/\/www.rwkv.com\/\">RWKV Language Model<\/a>, <a href=\"https:\/\/huggingface.co\/BlinkDL\/rwkv7-g1\">BlinkDL\/rwkv7-g1 \u00b7 Hugging Face<\/a>). The model is small for now, but it will be worth watching whether larger reasoning models also work well on an RWKV-style architecture. (Whether an LRM-style setup works on a state space model feels as if it might or might not be counterintuitive, and I have not settled on a view. I am very curious to see how this develops.)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Cohere's parameter-efficient multimodal, multilingual model Aya Vision (<a href=\"https:\/\/cohere.com\/blog\/aya-vision\">Aya Vision: Expanding the worlds AI can see<\/a>, <a 
href=\"https:\/\/huggingface.co\/collections\/CohereForAI\/c4ai-aya-vision-67c4ccd395ca064308ee1484?ref=cohere-ai.ghost.io\">C4AI Aya Vision &#8211; a CohereForAI Collection<\/a>) was also announced, and strong LLMs and MLLMs that run in local and on-premises environments are becoming more common.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Mistral OCR announcement is notable news for Document Understanding (<a href=\"https:\/\/mistral.ai\/en\/news\/mistral-ocr\">Mistral OCR | Mistral AI<\/a>). As I also thought with <a href=\"https:\/\/olmocr.allenai.org\/\">olmOCR \u2013 Open-Source OCR for Accurate Document Conversion<\/a>, MLLM-based Document Understanding looks powerful.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">True to its name, DeepSeek's Open Source Week released many libraries. The infrastructure code is especially interesting.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/open-infra-index\/tree\/main\">GitHub &#8211; deepseek-ai\/open-infra-index: Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation<\/a>\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/FlashMLA\">GitHub &#8211; deepseek-ai\/FlashMLA: FlashMLA: Efficient MLA decoding kernels<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/DeepEP\">GitHub &#8211; deepseek-ai\/DeepEP: DeepEP: an efficient expert-parallel communication library<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/DeepGEMM\">GitHub &#8211; deepseek-ai\/DeepGEMM: DeepGEMM: clean and 
efficient FP8 GEMM kernels with fine-grained scaling<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/EPLB\">GitHub &#8211; deepseek-ai\/EPLB: Expert Parallelism Load Balancer<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/DualPipe\">GitHub &#8211; deepseek-ai\/DualPipe: A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3\/R1 training.<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/profile-data\">GitHub &#8211; deepseek-ai\/profile-data: Analyze computation-communication overlap in V3\/R1.<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/3FS\">GitHub &#8211; deepseek-ai\/3FS: A high-performance distributed file system designed to address the challenges of AI training and inference workloads.<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/smallpond\">GitHub &#8211; deepseek-ai\/smallpond: A lightweight data processing framework built on DuckDB and 3FS.<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/deepseek-ai\/open-infra-index\/blob\/main\/202502OpenSourceWeek\/day_6_one_more_thing_deepseekV3R1_inference_system_overview.md\">open-infra-index\/202502OpenSourceWeek\/day_6_one_more_thing_deepseekV3R1_inference_system_overview.md at main \u00b7 deepseek-ai\/open-infra-index \u00b7 GitHub<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Last week again brought a variety of news. QwQ-32B claims performance competitive with DeepSeek-R1 (671B, 37B active) (QwQ-32B: Embracing the Power of Reinforcement &hellip; <a href=\"https:\/\/devneko.jp\/wordpress\/?p=6370\" class=\"more-link\"><span class=\"screen-reader-text\">&#8220;QwQ-32B, Jamba 1.6, RWKV7 G1, Aya Vision, Mistral OCR, DeepSeek Open Source Week&#8221; 
<\/span>Continue reading<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[117,223,251,346],"class_list":["post-6370","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-document-understanding","tag-llm","tag-mllm","tag-rwkv"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6370","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6370"}],"version-history":[{"count":0,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/6370\/revisions"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6370"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6370"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6370"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}