An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging [12.1]
本稿では,言語固有の大規模言語モデル(LLM)の推論能力の向上を目的とする。 DeepSeek R1は推論に優れていますが、主に英語や中国語のような高リソース言語にメリットがあります。低リソース言語は、英語中心のトレーニングデータとモデル最適化の優位性のため、いまだに保存されていない。
論文参考訳（メタデータ） (Thu, 13 Feb 2025 08:10:45 GMT)
LLMの推論能力を高めるためのモデルマージ+SFT、「We demonstrate that, with only publicly available datasets and a computational budget of $120, it is possible to enhance the reasoning capabilities of language-specific LLMs to match the level of DeepSeek R1, without compromising their performance on target language tasks.」とのこと
Qwen2.5とDeepSeek R1を利用した日本語大規模言語モデル「Qwen2.5 Bakeneko 32B」シリーズを公開｜rinna株式会社でも近いアプローチをとっているように見える。

コメントを残す

コメントを残す コメントをキャンセル